TY - JOUR
T1 - Managing code transformations for better performance portability
AU - Teixeira, Thiago S.F.X.
AU - Gropp, William
AU - Padua, David
N1 - The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This material is based in part upon work supported by the Department of Energy, National Nuclear Security Administration, under Award Number DE-NA0002374 and by the National Science Foundation under Award Number 1533912. This work utilizes resources supported by the National Science Foundation’s Major Research Instrumentation program, grant #1725729, as well as the University of Illinois at Urbana-Champaign.
PY - 2019/11/1
Y1 - 2019/11/1
N2 - Code optimization is an intricate task that is getting more complex as computing systems evolve. Managing the program optimization process, including the implementation and evaluation of code variants, is tedious, inefficient, and errors are likely to be introduced in the process. Moreover, because each platform typically requires a different sequence of transformations to fully harness its computing power, the optimization process complexity grows as new platforms are adopted. To address these issues, systems and frameworks have been proposed to automate the code optimization process. They, however, have not been widely adopted and are primarily used by experts with deep knowledge about underlying architecture and compiler intricacies. This article describes the requirements that we believe necessary for making automatic performance tuning more broadly used, especially in complex, long-lived high-performance computing applications. Besides discussing limitations of current systems and strategies to overcome these, we describe the design of a system that is able to semi-automatically generate efficient platform-specific code. In the proposed system, the code optimization is programmer-guided, separately from application code, on an external file in what we call optimization programming. The language to program the optimization process is able to represent complex collections of transformations and, as a result, generate efficient platform-specific code. A database manages different optimized versions of code regions, providing a pragmatic approach to performance portability, and the framework itself has separate components, allowing the optimized code to be used on systems without installing all of the modules required for the code generation. We present experiments on two different platforms to illustrate the generation of efficient platform-specific code that performs comparable to hand-optimized, vendor-provided code.
AB - Code optimization is an intricate task that is getting more complex as computing systems evolve. Managing the program optimization process, including the implementation and evaluation of code variants, is tedious, inefficient, and errors are likely to be introduced in the process. Moreover, because each platform typically requires a different sequence of transformations to fully harness its computing power, the optimization process complexity grows as new platforms are adopted. To address these issues, systems and frameworks have been proposed to automate the code optimization process. They, however, have not been widely adopted and are primarily used by experts with deep knowledge about underlying architecture and compiler intricacies. This article describes the requirements that we believe necessary for making automatic performance tuning more broadly used, especially in complex, long-lived high-performance computing applications. Besides discussing limitations of current systems and strategies to overcome these, we describe the design of a system that is able to semi-automatically generate efficient platform-specific code. In the proposed system, the code optimization is programmer-guided, separately from application code, on an external file in what we call optimization programming. The language to program the optimization process is able to represent complex collections of transformations and, as a result, generate efficient platform-specific code. A database manages different optimized versions of code regions, providing a pragmatic approach to performance portability, and the framework itself has separate components, allowing the optimized code to be used on systems without installing all of the modules required for the code generation. We present experiments on two different platforms to illustrate the generation of efficient platform-specific code that performs comparable to hand-optimized, vendor-provided code.
KW - Code generation
KW - compilers
KW - domain-specific language
KW - high-performance computing
KW - optimization
UR - http://www.scopus.com/inward/record.url?scp=85071011803&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85071011803&partnerID=8YFLogxK
U2 - 10.1177/1094342019865606
DO - 10.1177/1094342019865606
M3 - Article
AN - SCOPUS:85071011803
SN - 1094-3420
VL - 33
SP - 1290
EP - 1306
JO - International Journal of High Performance Computing Applications
JF - International Journal of High Performance Computing Applications
IS - 6
ER -