TY - GEN
T1 - Evaluation of parallel communication models in Nekbone, a Nek5000 mini-application
AU - Ivanov, Ilya
AU - Gong, Jing
AU - Akhmetova, Dana
AU - Peng, Ivy Bo
AU - Markidis, Stefano
AU - Laure, Erwin
AU - Machado, Rui
AU - Rahn, Mirko
AU - Bartsch, Valeria
AU - Hart, Alistair
AU - Fischer, Paul
PY - 2015/10/26
Y1 - 2015/10/26
AB - Nekbone is a proxy application for Nek5000, a scalable Computational Fluid Dynamics (CFD) code used for modelling incompressible flows. The Nekbone mini-application is used by several international co-design centers to explore new concepts in computer science and to evaluate their performance. We present the design and implementation of a new communication kernel in the Nekbone mini-application with the goal of studying the performance of different parallel communication models. First, we developed a new MPI blocking communication kernel to solve Nekbone problems on a three-dimensional Cartesian mesh and process topology. The new MPI implementation delivers a 13% performance improvement over the original implementation. The new MPI communication kernel consists of approximately 500 lines of code, compared with the original 7,000 lines, allowing experimentation with new approaches to parallel communication in Nekbone. Second, we replaced the MPI blocking communication in the new kernel with MPI non-blocking communication. Third, we developed a new Partitioned Global Address Space (PGAS) communication kernel based on the GPI-2 library. This approach reduces synchronization among neighboring processes and is on average 3% faster than the new MPI-based non-blocking approach. In our tests on 8,192 processes, the GPI-2 communication kernel is 3% faster than the new MPI non-blocking communication kernel. In addition, we use OpenMP in all versions of the new communication kernel. Finally, we outline the future steps for using the new communication kernel in the parent application Nek5000.
KW - Communication kernel
KW - GPI-2
KW - MPI
KW - Nek5000
KW - Nekbone
KW - PGAS
UR - http://www.scopus.com/inward/record.url?scp=84959298440&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84959298440&partnerID=8YFLogxK
U2 - 10.1109/CLUSTER.2015.131
DO - 10.1109/CLUSTER.2015.131
M3 - Conference contribution
AN - SCOPUS:84959298440
T3 - Proceedings - IEEE International Conference on Cluster Computing, ICCC
SP - 760
EP - 767
BT - Proceedings - 2015 IEEE International Conference on Cluster Computing, CLUSTER 2015
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - IEEE International Conference on Cluster Computing, CLUSTER 2015
Y2 - 8 September 2015 through 11 September 2015
ER -