TY - GEN
T1 - Modeling MPI communication performance on SMP nodes
T2 - 23rd European MPI Users' Group Meeting, EuroMPI 2016
AU - Gropp, William
AU - Olson, Luke N.
AU - Samfass, Philipp
N1 - Publisher Copyright: © 2016 ACM.
PY - 2016/9/25
Y1 - 2016/9/25
N2 - The "postal" model of communication [3, 8] T = α + βn, for sending n bytes of data between two processes with latency α and bandwidth 1/β, is perhaps the most commonly used communication performance model in parallel computing. This performance model is often used in developing and evaluating parallel algorithms in high-performance computing, and was an effective model when it was first proposed. Consequently, numerous tests of "ping pong" communication have been developed in order to measure these parameters in the model. However, with the advent of multicore nodes connected to a single (or a few) network interfaces, the model has become a poor match to modern hardware. In this paper, we show a simple three-parameter model that better captures the behavior of current parallel computing systems, and demonstrate its accuracy on several systems. In support of this model, which we call the max-rate model, we have developed an open-source benchmark that can be used to determine the model parameters.
AB - The "postal" model of communication [3, 8] T = α + βn, for sending n bytes of data between two processes with latency α and bandwidth 1/β, is perhaps the most commonly used communication performance model in parallel computing. This performance model is often used in developing and evaluating parallel algorithms in high-performance computing, and was an effective model when it was first proposed. Consequently, numerous tests of "ping pong" communication have been developed in order to measure these parameters in the model. However, with the advent of multicore nodes connected to a single (or a few) network interfaces, the model has become a poor match to modern hardware. In this paper, we show a simple three-parameter model that better captures the behavior of current parallel computing systems, and demonstrate its accuracy on several systems. In support of this model, which we call the max-rate model, we have developed an open-source benchmark that can be used to determine the model parameters.
KW - Bandwidth saturation
KW - Benchmark
KW - Communication
KW - Multicore
KW - Parallel computing
KW - Performance model
KW - Ping pong
KW - Symmetric multiprocessor cluster
UR - http://www.scopus.com/inward/record.url?scp=84995550157&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84995550157&partnerID=8YFLogxK
U2 - 10.1145/2966884.2966919
DO - 10.1145/2966884.2966919
M3 - Conference contribution
AN - SCOPUS:84995550157
T3 - ACM International Conference Proceeding Series
SP - 41
EP - 50
BT - Proceedings of the 23rd European MPI Users' Group Meeting, EuroMPI 2016
PB - Association for Computing Machinery
Y2 - 25 September 2016 through 28 September 2016
ER -