TY - JOUR
T1 - Self-consistent MPI performance guidelines
AU - Larsson Träff, Jesper
AU - Gropp, William D.
AU - Thakur, Rajeev
N1 - Funding Information:
The ideas expounded in this paper were first introduced in [35], although in a less finished form. The authors thank a number of colleagues for sometimes intense discussions of these ideas. At various stages, the paper has benefited significantly from the comments of anonymous reviewers, whose insight, time, and effort the authors gratefully acknowledge. The work was supported in part by the Mathematical, Information, and Computational Sciences Division subprogram of the Office of Advanced Scientific Computing Research, Office of Science, US Department of Energy, under Contract DE-AC02-06CH11357. J.L. Träff was with NEC Laboratories Europe while this work was carried out.
PY - 2010
Y1 - 2010
AB - Message passing using the Message-Passing Interface (MPI) is at present the most widely adopted framework for programming parallel applications for distributed memory and clustered parallel systems. For reasons of (universal) implementability, the MPI standard does not state any specific performance guarantees, but users expect MPI implementations to deliver good and consistent performance in the sense of efficient utilization of the underlying parallel (communication) system. For performance portability reasons, users also naturally desire communication optimizations performed on one parallel platform with one MPI implementation to be preserved when switching to another MPI implementation on another platform. We address the problem of ensuring performance consistency and portability by formulating performance guidelines and conditions that are desirable for good MPI implementations to fulfill. Instead of prescribing a specific performance model (which may be realistic on some systems, under some MPI protocol and algorithm assumptions, etc.), we formulate these guidelines by relating the performance of various aspects of the semantically strongly interrelated MPI standard to each other. Common-sense expectations, for instance, suggest that no MPI function should perform worse than a combination of other MPI functions that implement the same functionality, no specialized function should perform worse than a more general function that can implement the same functionality, no function with weak semantic guarantees should perform worse than a similar function with stronger semantics, and so on. Such guidelines may enable implementers to provide higher quality MPI implementations, minimize performance surprises, and eliminate the need for users to make special, nonportable optimizations by hand. We introduce and semiformalize the concept of self-consistent performance guidelines for MPI, and provide a (nonexhaustive) set of such guidelines in a form that could be automatically verified by benchmarks and experiment management tools. We present experimental results that show cases where guidelines are not satisfied in common MPI implementations, thereby indicating room for improvement in today's MPI implementations.
KW - MPI
KW - Message passing
KW - Message-passing interface
KW - Parallel processing
KW - Performance guidelines
KW - Performance model
KW - Performance portability
KW - Performance prediction
KW - Public benchmarking
UR - http://www.scopus.com/inward/record.url?scp=77950627571&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=77950627571&partnerID=8YFLogxK
U2 - 10.1109/TPDS.2009.120
DO - 10.1109/TPDS.2009.120
M3 - Article
AN - SCOPUS:77950627571
SN - 1045-9219
VL - 21
SP - 698
EP - 709
JO - IEEE Transactions on Parallel and Distributed Systems
JF - IEEE Transactions on Parallel and Distributed Systems
IS - 5
M1 - 5184825
ER -