TY - GEN
T1 - A Large-scale Study on API Misuses in the Wild
AU - Li, Xia
AU - Jiang, Jiajun
AU - Benton, Samuel
AU - Xiong, Yingfei
AU - Zhang, Lingming
N1 - Funding Information:
This work was partially supported by the National Key Research and Development Program of China under Grant No. SQ2019YFE010068, the National Science Foundation under Grant Nos. CCF-1763906 and CCF-1942430, Alibaba, and the National Natural Science Foundation of China under Grant No. 61922003.
Publisher Copyright:
© 2021 IEEE.
PY - 2021/4
Y1 - 2021/4
N2 - API misuses are prevalent and extremely harmful. Although various techniques have been proposed for API-misuse detection, it is not even clear how different types of API misuses are distributed and whether existing techniques cover all major types of API misuses. Therefore, in this paper, we conduct the first large-scale empirical study on API misuses based on 528,546 historical bug-fixing commits from GitHub (from 2011 to 2018). By leveraging a state-of-the-art fine-grained AST differencing tool, GumTree, we extract more than one million bug-fixing edit operations, 51.7% of which are API misuses. We further systematically classify API misuses into nine different categories according to the edit operations and context. We also extract various frequent API-misuse patterns based on the categories and corresponding operations, which can be complementary to existing API-misuse detection tools. Our study reveals various practical guidelines regarding the importance of different types of API misuses. Furthermore, based on our dataset, we perform a user study to manually analyze the usage constraints of 10 patterns to explore whether the mined patterns can guide the design of future API-misuse detection tools. Specifically, we find that 7,541 potential misuses still exist in the latest Apache projects, and 149 of them have been reported to developers. To date, 57 have been confirmed and fixed, and 15 have been rejected. The results indicate the importance of studying historical API misuses and the promising future of employing our mined patterns for detecting unknown API misuses.
AB - API misuses are prevalent and extremely harmful. Although various techniques have been proposed for API-misuse detection, it is not even clear how different types of API misuses are distributed and whether existing techniques cover all major types of API misuses. Therefore, in this paper, we conduct the first large-scale empirical study on API misuses based on 528,546 historical bug-fixing commits from GitHub (from 2011 to 2018). By leveraging a state-of-the-art fine-grained AST differencing tool, GumTree, we extract more than one million bug-fixing edit operations, 51.7% of which are API misuses. We further systematically classify API misuses into nine different categories according to the edit operations and context. We also extract various frequent API-misuse patterns based on the categories and corresponding operations, which can be complementary to existing API-misuse detection tools. Our study reveals various practical guidelines regarding the importance of different types of API misuses. Furthermore, based on our dataset, we perform a user study to manually analyze the usage constraints of 10 patterns to explore whether the mined patterns can guide the design of future API-misuse detection tools. Specifically, we find that 7,541 potential misuses still exist in the latest Apache projects, and 149 of them have been reported to developers. To date, 57 have been confirmed and fixed, and 15 have been rejected. The results indicate the importance of studying historical API misuses and the promising future of employing our mined patterns for detecting unknown API misuses.
KW - Code abstraction
KW - Pattern generation
KW - Program adaptation
UR - http://www.scopus.com/inward/record.url?scp=85107906938&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85107906938&partnerID=8YFLogxK
U2 - 10.1109/ICST49551.2021.00034
DO - 10.1109/ICST49551.2021.00034
M3 - Conference contribution
AN - SCOPUS:85107906938
T3 - Proceedings - 2021 IEEE 14th International Conference on Software Testing, Verification and Validation, ICST 2021
SP - 241
EP - 252
BT - Proceedings - 2021 IEEE 14th International Conference on Software Testing, Verification and Validation, ICST 2021
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 14th IEEE International Conference on Software Testing, Verification and Validation, ICST 2021
Y2 - 12 April 2021 through 16 April 2021
ER -