TY - GEN
T1 - Scalable Automatic Differentiation of Multiple Parallel Paradigms through Compiler Augmentation
AU - Moses, William S.
AU - Narayanan, Sri Hari Krishna
AU - Paehler, Ludger
AU - Churavy, Valentin
AU - Schanen, Michel
AU - Hückelheim, Jan
AU - Doerfert, Johannes
AU - Hovland, Paul
N1 - Publisher Copyright:
© 2022 IEEE.
PY - 2022
Y1 - 2022
N2 - Derivatives are key to numerous science, engineering, and machine learning applications. While existing tools generate derivatives of programs in a single language, modern parallel applications combine a set of frameworks and languages to leverage available performance and function in an evolving hardware landscape. We propose a scheme for differentiating arbitrary DAG-based parallelism that preserves scalability and efficiency, implemented into the LLVM-based Enzyme automatic differentiation framework. By integrating with a full-fledged compiler backend, Enzyme can differentiate numerous parallel frameworks and directly control code generation. Combined with its ability to differentiate any LLVM-based language, this flexibility permits Enzyme to leverage the compiler tool chain for parallel and differentiation-specific optimizations. We differentiate nine distinct versions of the LULESH and miniBUDE applications, written in different programming languages (C++, Julia) and parallel frameworks (OpenMP, MPI, RAJA, Julia tasks, MPI.jl), demonstrating similar scalability to the original program. On benchmarks with 64 threads or nodes, we find a differentiation overhead of 3.4-6.8× on C++ and 5.4-12.5× on Julia.
AB - Derivatives are key to numerous science, engineering, and machine learning applications. While existing tools generate derivatives of programs in a single language, modern parallel applications combine a set of frameworks and languages to leverage available performance and function in an evolving hardware landscape. We propose a scheme for differentiating arbitrary DAG-based parallelism that preserves scalability and efficiency, implemented into the LLVM-based Enzyme automatic differentiation framework. By integrating with a full-fledged compiler backend, Enzyme can differentiate numerous parallel frameworks and directly control code generation. Combined with its ability to differentiate any LLVM-based language, this flexibility permits Enzyme to leverage the compiler tool chain for parallel and differentiation-specific optimizations. We differentiate nine distinct versions of the LULESH and miniBUDE applications, written in different programming languages (C++, Julia) and parallel frameworks (OpenMP, MPI, RAJA, Julia tasks, MPI.jl), demonstrating similar scalability to the original program. On benchmarks with 64 threads or nodes, we find a differentiation overhead of 3.4-6.8× on C++ and 5.4-12.5× on Julia.
KW - automatic differentiation
KW - C++
KW - compiler
KW - distributed
KW - Enzyme
KW - hybrid parallelization
KW - Julia
KW - LLVM
KW - MPI
KW - OpenMP
KW - parallel
KW - RAJA
KW - tasks
UR - http://www.scopus.com/inward/record.url?scp=85149303815&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85149303815&partnerID=8YFLogxK
U2 - 10.1109/SC41404.2022.00065
DO - 10.1109/SC41404.2022.00065
M3 - Conference contribution
AN - SCOPUS:85149303815
T3 - International Conference for High Performance Computing, Networking, Storage and Analysis, SC
BT - Proceedings of SC 2022
PB - IEEE Computer Society
T2 - 2022 International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2022
Y2 - 13 November 2022 through 18 November 2022
ER -