TY - GEN
T1 - MPI on a million processors
AU - Balaji, Pavan
AU - Buntinas, Darius
AU - Goodell, David
AU - Gropp, William
AU - Kumar, Sameer
AU - Lusk, Ewing
AU - Thakur, Rajeev
AU - Träff, Jesper Larsson
N1 - Copyright 2021 Elsevier B.V., All rights reserved.
PY - 2009
Y1 - 2009
N2 - Petascale machines with close to a million processors will soon be available. Although MPI is the dominant programming model today, some researchers and users wonder (and perhaps even doubt) whether MPI will scale to such large processor counts. In this paper, we examine how scalable MPI is. We first examine the MPI specification itself and discuss areas with scalability concerns and how they can be overcome. We then investigate issues that an MPI implementation must address to be scalable. We ran experiments to measure MPI memory consumption at scale on up to 131,072 processes, or 80% of the IBM Blue Gene/P system at Argonne National Laboratory. Based on the results, we tuned the MPI implementation to reduce its memory footprint. We also discuss issues in application algorithmic scalability to large process counts and features of MPI that enable the use of other techniques to overcome scalability limitations in applications.
UR - http://www.scopus.com/inward/record.url?scp=70350482696&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=70350482696&partnerID=8YFLogxK
U2 - 10.1007/978-3-642-03770-2_9
DO - 10.1007/978-3-642-03770-2_9
M3 - Conference contribution
AN - SCOPUS:70350482696
SN - 3642037690
SN - 9783642037696
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 20
EP - 30
BT - Recent Advances in Parallel Virtual Machine and Message Passing Interface - 16th European PVM/MPI Users' Group Meeting, Proceedings
PB - Springer
T2 - 16th European Parallel Virtual Machine and Message Passing Interface Users' Group Meeting, EuroPVM/MPI
Y2 - 7 September 2009 through 10 September 2009
ER -