TY - GEN
T1 - Node-Aware Improvements to Allreduce
AU - Bienz, Amanda
AU - Olson, Luke
AU - Gropp, William
N1 - Publisher Copyright: © 2019 IEEE.
PY - 2019/11
Y1 - 2019/11
AB - The MPI_Allreduce collective operation is a core kernel of many parallel codebases, particularly for reductions over a single value per process. The commonly used allreduce recursive-doubling algorithm obtains the lower-bound message count, yielding optimality for small reduction sizes based on node-agnostic performance models. However, this algorithm yields duplicate messages between sets of nodes. Node-aware optimizations in MPICH remove duplicate messages through use of a single master process per node, yielding a large number of inactive processes at each inter-node step. In this paper, we present an algorithm that uses the multiple processes available per node to reduce the maximum number of inter-node messages communicated by a single process, improving the performance of allreduce operations, particularly for small message sizes.
UR - http://www.scopus.com/inward/record.url?scp=85079270590&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85079270590&partnerID=8YFLogxK
DO - 10.1109/ExaMPI49596.2019.00008
M3 - Conference contribution
AN - SCOPUS:85079270590
T3 - Proceedings of ExaMPI 2019: Workshop on Exascale MPI - Held in conjunction with SC 2019: The International Conference for High Performance Computing, Networking, Storage and Analysis
SP - 19
EP - 28
BT - Proceedings of ExaMPI 2019
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2019 IEEE/ACM Workshop on Exascale MPI, ExaMPI 2019
Y2 - 17 November 2019
ER -