TY - GEN
T1 - Accelerating Scientific Applications on Heterogeneous Systems with HybridOMP
AU - Diener, Matthias
AU - Bodony, Daniel J.
AU - Kale, Laxmikant
N1 - Publisher Copyright: © 2019, Springer Nature Switzerland AG.
PY - 2019
Y1 - 2019
AB - High Performance Computing relies on accelerators (such as GPGPUs) to achieve fast execution of scientific applications. Traditionally these accelerators have been programmed with specialized languages, such as CUDA or OpenCL. In recent years, OpenMP emerged as a promising alternative for supporting accelerators, providing advantages such as maintaining a single code base for the host and different accelerator types and providing a simple way to extend support for accelerators to existing code. Efficiently using this support requires solving several challenges, related to performance, work partitioning, and concurrent execution on multiple device types. In this paper, we discuss these challenges and introduce a library, HybridOMP, that addresses several of them, thus enabling the effective use of OpenMP for accelerators. We apply HybridOMP to a scientific application, PlasCom2, that has not previously been able to use accelerators. Experiments on three architectures show that HybridOMP results in performance gains of up to 10x compared to CPU-only execution. Concurrent execution on the host and GPU resulted in additional gains of up to 10% compared to running on the GPU only.
KW - Accelerators
KW - GPGPU
KW - Heterogeneous computing
KW - OpenMP
UR - http://www.scopus.com/inward/record.url?scp=85064640353&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85064640353&partnerID=8YFLogxK
U2 - 10.1007/978-3-030-15996-2_13
DO - 10.1007/978-3-030-15996-2_13
M3 - Conference contribution
AN - SCOPUS:85064640353
SN - 9783030159955
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 174
EP - 187
BT - High Performance Computing for Computational Science – VECPAR 2018 - 13th International Conference, Revised Selected Papers
A2 - Senger, Hermes
A2 - Marques, Osni
A2 - de Brito, Tatiana Pinheiro
A2 - Iope, Rogério
A2 - Stanzani, Silvio
A2 - Gil-Costa, Veronica
A2 - Garcia, Rogerio
PB - Springer
T2 - 13th International Conference on High Performance Computing in Computational Science, VECPAR 2018
Y2 - 17 September 2018 through 19 September 2018
ER -