TY - JOUR
T1 - The impacts of techniques, programs and tests on automated program repair
T2 - An empirical study
AU - Kong, Xianglong
AU - Zhang, Lingming
AU - Wong, W. Eric
AU - Li, Bixin
N1 - Funding Information:
We would like to thank W. Weimer, C. Le Goues, and Y. Qi et al. for sharing the source code of GenProg and RSRepair with us. This work is sponsored partially by China Scholarship Council No. 201406090080, partially by National Natural Science Foundation of China under Grant Nos. 61572126 and 61402103, and partially by Huawei Innovation Research Program (HIRP) under Grant No. YB2013120195. Xianglong Kong is a joint Ph.D. student of Southeast University and The University of Texas at Dallas. He received his Bachelor’s degree in Computer Science from Southeast University (China) in 2009. He studied under the supervision of Prof. W. Eric Wong and Lingming Zhang in the Department of Computer Science, The University of Texas at Dallas, from 2014 to 2016. He is studying under the supervision of Prof. Bixin Li in the Software Engineering Institute, Southeast University. Lingming Zhang is an assistant professor at The University of Texas at Dallas. He received his Ph.D. in May 2014 from the Electrical & Computer Engineering Department at The University of Texas at Austin under the supervision of Prof. Sarfraz Khurshid. He received his Master’s degree in Computer Science from Peking University (China) in 2010 under the supervision of Prof. Lu Zhang. Before that, he received his Bachelor’s degree in Computer Science from Nanjing University (China) in 2007. W. Eric Wong received his M.S. and Ph.D. in Computer Science from Purdue University. He is a full professor and the founding director of the Advanced Research Center for Software Testing and Quality Assurance in Computer Science, University of Texas at Dallas (UTD). He also has an appointment as a guest researcher with the National Institute of Standards and Technology (NIST), an agency of the US Department of Commerce. Prior to joining UTD, he was with Telcordia Technologies (formerly Bellcore) as a senior research scientist and the project manager in charge of Dependable Telecom Software Development. 
In 2014, he was named the IEEE Reliability Society Engineer of the Year. His research focuses on helping practitioners improve the quality of software while reducing the cost of production. In particular, he is working on software testing, debugging, risk analysis/metrics, safety, and reliability. He has extensive experience developing real-life industry applications of his research results. Professor Wong is the Editor-in-Chief of IEEE Transactions on Reliability. He is also the Founding Steering Committee Chair of the IEEE International Conference on Software Quality, Reliability, and Security (QRS) and the IEEE International Workshop on Program Debugging (IWPD). Bixin Li is a professor in the School of Computer Science and Engineering at Southeast University, Nanjing, China. His research interests include program slicing and its application; software evolution and maintenance; and software modeling, analysis, testing, and verification. He has published over 90 articles in refereed conferences and journals. He leads a Software Engineering Institute at Southeast University, where over 20 young men and women are working hard on national and international projects.
Publisher Copyright:
© 2017
PY - 2018/3
Y1 - 2018/3
N2 - Manual program repair is notoriously tedious, error-prone, and costly, especially for modern large-scale projects. Automated program repair can find program patches without much human intervention, greatly reducing the burden on developers and accelerating software delivery. Therefore, much research effort has been dedicated to designing powerful program repair techniques. Although various program repair techniques have been proposed, to our knowledge no extensive study has examined the impacts of repair techniques, subject programs, and test suites on repair effectiveness and efficiency. In this paper, we perform such an extensive study on repairing 180 seeded and real faults from 17 small to large sized programs. We study the impacts of five representative automated program repair techniques (GenProg, RSRepair, a brute-force-based technique, AE, and Kali) on the repair results. We further investigate the impacts of different subject programs and test suites on the effectiveness and efficiency of program repair techniques. Our study yields a number of interesting findings: the brute-force-based technique generates the most patches but is also the most costly, while Kali is the most efficient and has medium effectiveness among the studied techniques; techniques that work well with small programs become too costly or ineffective when applied to large programs; since tool-reported patches may overfit the selected test cases, we calculate the false positive rates and find that the influence of failed test cases is much larger than that of passed test cases; finally, and surprisingly, all the studied techniques except RSRepair can find more than 80% of successful patches within the first 50% of the search space.
AB - Manual program repair is notoriously tedious, error-prone, and costly, especially for modern large-scale projects. Automated program repair can find program patches without much human intervention, greatly reducing the burden on developers and accelerating software delivery. Therefore, much research effort has been dedicated to designing powerful program repair techniques. Although various program repair techniques have been proposed, to our knowledge no extensive study has examined the impacts of repair techniques, subject programs, and test suites on repair effectiveness and efficiency. In this paper, we perform such an extensive study on repairing 180 seeded and real faults from 17 small to large sized programs. We study the impacts of five representative automated program repair techniques (GenProg, RSRepair, a brute-force-based technique, AE, and Kali) on the repair results. We further investigate the impacts of different subject programs and test suites on the effectiveness and efficiency of program repair techniques. Our study yields a number of interesting findings: the brute-force-based technique generates the most patches but is also the most costly, while Kali is the most efficient and has medium effectiveness among the studied techniques; techniques that work well with small programs become too costly or ineffective when applied to large programs; since tool-reported patches may overfit the selected test cases, we calculate the false positive rates and find that the influence of failed test cases is much larger than that of passed test cases; finally, and surprisingly, all the studied techniques except RSRepair can find more than 80% of successful patches within the first 50% of the search space.
KW - Automated program repair
KW - Empirical study
UR - http://www.scopus.com/inward/record.url?scp=85021454419&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85021454419&partnerID=8YFLogxK
U2 - 10.1016/j.jss.2017.06.039
DO - 10.1016/j.jss.2017.06.039
M3 - Article
AN - SCOPUS:85021454419
SN - 0164-1212
VL - 137
SP - 480
EP - 496
JO - Journal of Systems and Software
JF - Journal of Systems and Software
ER -