TY - GEN
T1 - Language-parametric compiler validation with application to LLVM
AU - Kasampalis, Theodoros
AU - Park, Daejun
AU - Lin, Zhengyao
AU - Adve, Vikram S.
AU - Rosu, Grigore
N1 - Publisher Copyright:
© 2021 ACM.
PY - 2021/4/19
Y1 - 2021/4/19
N2 - We propose a new design for a Translation Validation (TV) system geared towards practical use with modern optimizing compilers, such as LLVM. Unlike existing TV systems, which are custom-tailored for a particular sequence of transformations and a specific, common language for input and output programs, our design clearly separates the transformation-specific components from the rest of the system, and generalizes the transformation-independent components. Specifically, we present Keq, the first program equivalence checker that is parametric to the input and output language semantics and has no dependence on the transformation between the input and output programs. The Keq algorithm is based on a rigorous formalization, namely cut-bisimulation, and is proven correct. We have prototyped a TV system for the Instruction Selection pass of LLVM, being able to automatically prove equivalence for translations from LLVM IR to the MachineIR used in compiling to x86-64. This transformation uses different input and output languages, and as such has not been previously addressed by the state of the art. An experimental evaluation shows that Keq successfully proves correct the translation of over 90% of 4732 supported functions in GCC from SPEC 2006.
AB - We propose a new design for a Translation Validation (TV) system geared towards practical use with modern optimizing compilers, such as LLVM. Unlike existing TV systems, which are custom-tailored for a particular sequence of transformations and a specific, common language for input and output programs, our design clearly separates the transformation-specific components from the rest of the system, and generalizes the transformation-independent components. Specifically, we present Keq, the first program equivalence checker that is parametric to the input and output language semantics and has no dependence on the transformation between the input and output programs. The Keq algorithm is based on a rigorous formalization, namely cut-bisimulation, and is proven correct. We have prototyped a TV system for the Instruction Selection pass of LLVM, being able to automatically prove equivalence for translations from LLVM IR to the MachineIR used in compiling to x86-64. This transformation uses different input and output languages, and as such has not been previously addressed by the state of the art. An experimental evaluation shows that Keq successfully proves correct the translation of over 90% of 4732 supported functions in GCC from SPEC 2006.
KW - Compilers
KW - Program Equivalence
KW - Simulation
KW - Translation Validation
UR - http://www.scopus.com/inward/record.url?scp=85104708517&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85104708517&partnerID=8YFLogxK
U2 - 10.1145/3445814.3446751
DO - 10.1145/3445814.3446751
M3 - Conference contribution
AN - SCOPUS:85104708517
T3 - International Conference on Architectural Support for Programming Languages and Operating Systems - ASPLOS
SP - 1004
EP - 1019
BT - Proceedings of the 26th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, ASPLOS 2021
PB - Association for Computing Machinery
T2 - 26th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, ASPLOS 2021
Y2 - 19 April 2021 through 23 April 2021
ER -