TY - GEN
T1 - JSidentify
T2 - 42nd ACM/IEEE International Conference on Software Engineering: Software Engineering in Practice, ICSE-SEIP 2020
AU - Xia, Qun
AU - Zhou, Zhongzhu
AU - Li, Zhihao
AU - Xu, Bin
AU - Zou, Wei
AU - Chen, Zishun
AU - Ma, Huafeng
AU - Liang, Gangqiang
AU - Lu, Haochuan
AU - Guo, Shiyu
AU - Xiong, Ting
AU - Deng, Yuetang
AU - Xie, Tao
N1 - Funding Information:
∗This author did the research during his internship at Tencent Inc. His work is supported in part by the National Natural Science Foundation of China under Grant U1911201 and the Guangdong Special Support Program under Grant 2017TX04X148. †The author is affiliated with Key Laboratory of High Confidence Software Technologies (Peking University), Ministry of Education. His work is supported in part by NSF under grants CNS-1564274 and CCF-1816615.
Publisher Copyright:
© 2020 IEEE Computer Society. All rights reserved.
PY - 2020/6/27
Y1 - 2020/6/27
N2 - Online mini games are lightweight game apps, typically implemented in JavaScript (JS), that run inside another host mobile app (such as WeChat, Baidu, and Alipay). These mini games do not need to be downloaded or upgraded through an app store, making it possible for one host mobile app to perform the aggregated services of many apps. Hundreds of millions of users play tens of thousands of mini games, which make a great profit, and consequently are popular targets of plagiarism. In cases of plagiarism, deeply obfuscated code cloned from the original code often embodies malicious code segments and copyright infringements, posing great challenges for existing plagiarism detection tools. To address these challenges, in this paper, we design and implement JSidentify, a hybrid framework to detect plagiarism among online mini games. JSidentify includes three techniques based on different levels of code abstraction. JSidentify applies the included techniques in the constructed priority list one by one to reduce overall detection time. Our evaluation results show that JSidentify outperforms other existing related state-of-the-art approaches and achieves the best precision and recall with affordable detection time when detecting plagiarism among online mini games and clones among general JS programs. Our deployment experience of JSidentify also shows that JSidentify is indispensable in the daily operations of online mini games in WeChat.
AB - Online mini games are lightweight game apps, typically implemented in JavaScript (JS), that run inside another host mobile app (such as WeChat, Baidu, and Alipay). These mini games do not need to be downloaded or upgraded through an app store, making it possible for one host mobile app to perform the aggregated services of many apps. Hundreds of millions of users play tens of thousands of mini games, which make a great profit, and consequently are popular targets of plagiarism. In cases of plagiarism, deeply obfuscated code cloned from the original code often embodies malicious code segments and copyright infringements, posing great challenges for existing plagiarism detection tools. To address these challenges, in this paper, we design and implement JSidentify, a hybrid framework to detect plagiarism among online mini games. JSidentify includes three techniques based on different levels of code abstraction. JSidentify applies the included techniques in the constructed priority list one by one to reduce overall detection time. Our evaluation results show that JSidentify outperforms other existing related state-of-the-art approaches and achieves the best precision and recall with affordable detection time when detecting plagiarism among online mini games and clones among general JS programs. Our deployment experience of JSidentify also shows that JSidentify is indispensable in the daily operations of online mini games in WeChat.
KW - Clone Detection
KW - JavaScript
KW - Online Mini Games
KW - Plagiarism Detection
UR - http://www.scopus.com/inward/record.url?scp=85092557644&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85092557644&partnerID=8YFLogxK
U2 - 10.1145/3377813.3381352
DO - 10.1145/3377813.3381352
M3 - Conference contribution
AN - SCOPUS:85092557644
T3 - Proceedings - International Conference on Software Engineering
SP - 211
EP - 220
BT - Proceedings - 2020 ACM/IEEE 42nd International Conference on Software Engineering
PB - IEEE Computer Society
Y2 - 27 June 2020 through 19 July 2020
ER -