TY - GEN
T1 - Inferring Program Transformations from Singular Examples via Big Code
AU - Jiang, Jiajun
AU - Ren, Luyao
AU - Xiong, Yingfei
AU - Zhang, Lingming
N1 - Funding Information:
This work was partially supported by the National Key Research and Development Program of China under Grant No. 2017YFB1001803, the National Natural Science Foundation of China under Grant Nos. 61672045 and 61529201, the National Science Foundation under Grant Nos. CCF-1566589 and CCF-1763906, and Amazon. Special thanks go to Xia Li (UT Dallas), who shared the big code base with us, making our large-scale evaluation possible.
Publisher Copyright:
© 2019 IEEE.
PY - 2019/11
Y1 - 2019/11
AB - Inferring program transformations from concrete program changes has many potential uses, such as applying systematic program edits, refactoring, and automated program repair. Existing work on inferring program transformations usually relies on statistical information over a potentially large set of program-change examples. However, in many practical scenarios we do not have such a large set of program-change examples. In this paper, we address the challenge of inferring a program transformation from a single example. Our core insight is that 'big code' can provide effective guidance for generalizing a concrete change into a program transformation, i.e., code elements appearing in many files are general and should not be abstracted away. We first propose a framework for transformation inference, where programs are represented as hypergraphs to enable fine-grained generalization of transformations. We then design a transformation inference approach, GENPAT, that infers a program transformation based on code context and statistics from a big code corpus. We have evaluated GENPAT under two distinct application scenarios, systematic editing and program repair. The evaluation on systematic editing shows that GENPAT significantly outperforms a state-of-the-art approach, SYDIT, with up to 5.5x as many correctly transformed cases. The evaluation on program repair suggests that GENPAT has the potential to be integrated into advanced program repair tools: GENPAT successfully repaired 19 real-world bugs in the Defects4J benchmark by simply applying transformations inferred from existing patches, and 4 of these bugs had never been repaired by any existing technique. Overall, the evaluation results suggest that GENPAT is effective for transformation inference and can potentially be adopted for many different applications.
KW - Code abstraction
KW - Pattern generation
KW - Program adaptation
UR - http://www.scopus.com/inward/record.url?scp=85078902077&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85078902077&partnerID=8YFLogxK
U2 - 10.1109/ASE.2019.00033
DO - 10.1109/ASE.2019.00033
M3 - Conference contribution
AN - SCOPUS:85078902077
T3 - Proceedings - 2019 34th IEEE/ACM International Conference on Automated Software Engineering, ASE 2019
SP - 255
EP - 266
BT - Proceedings - 2019 34th IEEE/ACM International Conference on Automated Software Engineering, ASE 2019
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 34th IEEE/ACM International Conference on Automated Software Engineering, ASE 2019
Y2 - 10 November 2019 through 15 November 2019
ER -