TY - GEN
T1 - Enabling Transformers to Understand Low-Level Programs
AU - Guo, Zifan Carl
AU - Moses, William S.
N1 - Publisher Copyright:
© 2022 IEEE.
PY - 2022
Y1 - 2022
N2 - Unlike prior approaches to machine learning, Transformer models can first be pre-trained on a large corpus of unlabeled data with a generic objective and then fine-tuned on a smaller task-specific dataset. This versatility has led to both larger models and larger datasets, and consequently Transformers have produced breakthroughs in the field of natural language processing. Generic program optimization presently operates on low-level representations such as LLVM IR. Unlike high-level languages (e.g., C, Python, Java), which have seen initial success in machine-learning analyses, lower-level languages tend to be more verbose and repetitive in order to precisely specify program behavior, provide more detail about the microarchitecture, and allow the properties necessary for optimization to be derived, all of which makes them difficult for machine learning models to handle. In this work, we apply transfer learning to low-level (LLVM) programs and study how such programs can be made more amenable to Transformer models through various techniques, including preprocessing, infix/prefix operators, and information deduplication. We evaluate the effectiveness of these techniques through a series of ablation studies on the task of translating C to both unoptimized (-O0) and optimized (-O1) LLVM IR. On the AnghaBench dataset, our model achieves a 49.57% verbatim match and a BLEU score of 87.68 against Clang -O0, and a 38.73% verbatim match and a BLEU score of 77.03 against Clang -O1.
AB - Unlike prior approaches to machine learning, Transformer models can first be pre-trained on a large corpus of unlabeled data with a generic objective and then fine-tuned on a smaller task-specific dataset. This versatility has led to both larger models and larger datasets, and consequently Transformers have produced breakthroughs in the field of natural language processing. Generic program optimization presently operates on low-level representations such as LLVM IR. Unlike high-level languages (e.g., C, Python, Java), which have seen initial success in machine-learning analyses, lower-level languages tend to be more verbose and repetitive in order to precisely specify program behavior, provide more detail about the microarchitecture, and allow the properties necessary for optimization to be derived, all of which makes them difficult for machine learning models to handle. In this work, we apply transfer learning to low-level (LLVM) programs and study how such programs can be made more amenable to Transformer models through various techniques, including preprocessing, infix/prefix operators, and information deduplication. We evaluate the effectiveness of these techniques through a series of ablation studies on the task of translating C to both unoptimized (-O0) and optimized (-O1) LLVM IR. On the AnghaBench dataset, our model achieves a 49.57% verbatim match and a BLEU score of 87.68 against Clang -O0, and a 38.73% verbatim match and a BLEU score of 77.03 against Clang -O1.
KW - compilers
KW - LLVM
KW - machine learning
KW - machine translation
KW - NLP
UR - http://www.scopus.com/inward/record.url?scp=85142302492&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85142302492&partnerID=8YFLogxK
U2 - 10.1109/HPEC55821.2022.9926313
DO - 10.1109/HPEC55821.2022.9926313
M3 - Conference contribution
AN - SCOPUS:85142302492
T3 - 2022 IEEE High Performance Extreme Computing Conference, HPEC 2022
BT - 2022 IEEE High Performance Extreme Computing Conference, HPEC 2022
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2022 IEEE High Performance Extreme Computing Conference, HPEC 2022
Y2 - 19 September 2022 through 23 September 2022
ER -