TY - JOUR
T1 - A massively scalable distributed multigrid framework for nonlinear marine hydrodynamics
AU - Glimberg, Stefan Lemvig
AU - Engsig-Karup, Allan Peter
AU - Olson, Luke N.
N1 - Funding Information:
The authors would like to thank Prof Jan Hesthaven, senior researcher Andy Terrel, and CUDA fellow Alan Gray for their support in enabling the massive scalability tests. Part of this research was conducted using computational resources and services at the Center for Computation and Visualization, Brown University. The authors acknowledge the Texas Advanced Computing Center at The University of Texas at Austin for providing HPC resources that have contributed to the research results reported within this article. The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the Danish Council for Independent Research | Technology and Production Sciences in Denmark. Nvidia Corporation supported the project with kind hardware donations through the NVIDIA Academic Partnership Program. This research used resources of the Oak Ridge Leadership Computing Facility at the Oak Ridge National Laboratory, which is supported by the Office of Science of the US Department of Energy under contract no DE-AC05-00OR22725.
PY - 2019/9/1
Y1 - 2019/9/1
N2 - The focus of this article is on the parallel scalability of a distributed multigrid framework, known as the DTU Compute GPUlab Library, for execution on graphics processing unit (GPU)-accelerated supercomputers. We demonstrate near-ideal weak scalability for a high-order fully nonlinear potential flow (FNPF) time domain model on the Oak Ridge Titan supercomputer, which is equipped with a large number of many-core CPU-GPU nodes. The high-order finite difference scheme for the solver is implemented to expose data locality and scalability, and the linear Laplace solver is based on an iterative multilevel preconditioned defect correction method designed for high-throughput processing and massive parallelism. In this work, the FNPF model is based on a multi-block discretization that allows for large-scale simulations. In this setup, each grid block is based on a logically structured mesh with support for curvilinear representation of horizontal block boundaries, allowing an accurate representation of geometric features such as surface-piercing bottom-mounted structures, for example, the mono-pile foundations demonstrated here. Unprecedented performance and scalability results are presented for a system of equations that has historically been regarded as too expensive to solve in practical applications. A novel feature of the potential flow model is demonstrated: a modest number of multigrid restrictions is sufficient for fast convergence, improving overall parallel scalability as the coarse grid problem diminishes. In the numerical benchmarks presented, we demonstrate the use of 8192 modern Nvidia GPUs, enabling large-scale and high-resolution nonlinear marine hydrodynamics applications.
AB - The focus of this article is on the parallel scalability of a distributed multigrid framework, known as the DTU Compute GPUlab Library, for execution on graphics processing unit (GPU)-accelerated supercomputers. We demonstrate near-ideal weak scalability for a high-order fully nonlinear potential flow (FNPF) time domain model on the Oak Ridge Titan supercomputer, which is equipped with a large number of many-core CPU-GPU nodes. The high-order finite difference scheme for the solver is implemented to expose data locality and scalability, and the linear Laplace solver is based on an iterative multilevel preconditioned defect correction method designed for high-throughput processing and massive parallelism. In this work, the FNPF model is based on a multi-block discretization that allows for large-scale simulations. In this setup, each grid block is based on a logically structured mesh with support for curvilinear representation of horizontal block boundaries, allowing an accurate representation of geometric features such as surface-piercing bottom-mounted structures, for example, the mono-pile foundations demonstrated here. Unprecedented performance and scalability results are presented for a system of equations that has historically been regarded as too expensive to solve in practical applications. A novel feature of the potential flow model is demonstrated: a modest number of multigrid restrictions is sufficient for fast convergence, improving overall parallel scalability as the coarse grid problem diminishes. In the numerical benchmarks presented, we demonstrate the use of 8192 modern Nvidia GPUs, enabling large-scale and high-resolution nonlinear marine hydrodynamics applications.
KW - High-performance computing
KW - Laplace problem
KW - domain decomposition
KW - free surface water waves
KW - geometric multigrid
KW - heterogeneous computing
KW - multi-GPU
KW - multi-block solver
UR - http://www.scopus.com/inward/record.url?scp=85061189782&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85061189782&partnerID=8YFLogxK
U2 - 10.1177/1094342019826662
DO - 10.1177/1094342019826662
M3 - Article
AN - SCOPUS:85061189782
VL - 33
SP - 855
EP - 868
JO - International Journal of High Performance Computing Applications
JF - International Journal of High Performance Computing Applications
SN - 1094-3420
IS - 5
ER -