TY - GEN
T1 - An Abstraction for Distributed Stencil Computations Using Charm++
AU - Bhosale, Aditya
AU - Fink, Zane
AU - Kale, Laxmikant
N1 - Publisher Copyright: © The Author(s), under exclusive license to Springer Nature Switzerland AG 2024.
PY - 2024
Y1 - 2024
AB - Python has emerged as a popular programming language for scientific computing in recent years, thanks to libraries like NumPy and SciPy. NumPy, in particular, is widely utilized for prototyping numerical solvers using methods such as finite difference, finite volume, and multigrid. However, NumPy’s performance is confined to a single node, compelling programmers to resort to a lower-level language for running large-scale simulations. In this paper, we introduce CharmStencil, a high-level abstraction featuring a NumPy-like Python frontend and a highly efficient Charm++ backend. Employing a client-server model, CharmStencil maintains productivity with tools like Jupyter notebooks on the frontend while utilizing a high-performance Charm++ library on the backend for computation. We demonstrate that CharmStencil achieves orders of magnitude better single-threaded performance compared to NumPy and can scale to thousands of CPU cores. Additionally, we showcase superior performance compared to cuNumeric and Numba, popular Python libraries for parallel array computations.
KW - Charm++
KW - Distributed
KW - Numpy
KW - Python
KW - Stencil
UR - http://www.scopus.com/inward/record.url?scp=85197218750&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85197218750&partnerID=8YFLogxK
U2 - 10.1007/978-3-031-61763-8_12
DO - 10.1007/978-3-031-61763-8_12
M3 - Conference contribution
AN - SCOPUS:85197218750
SN - 9783031617621
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 123
EP - 134
BT - Asynchronous Many-Task Systems and Applications - 2nd International Workshop, WAMTA 2024, Proceedings
A2 - Diehl, Patrick
A2 - Schuchart, Joseph
A2 - Valero-Lara, Pedro
A2 - Bosilca, George
PB - Springer
T2 - 2nd International Workshop on Asynchronous Many-Task Systems and Applications, WAMTA 2024
Y2 - 14 February 2024 through 16 February 2024
ER -