Faster topology-aware collective algorithms through non-minimal communication

Paul Sack, William Gropp

Research output: Chapter in Book/Report/Conference proceedingConference contribution

Abstract

Known algorithms for two important collective communication operations, allgather and reduce-scatter, are minimal-communication algorithms; no process sends or receives more than the minimum amount of data. This, combined with the data-ordering semantics of the operations, limits the flexibility and performance of these algorithms. Our novel non-minimal, topology-aware algorithms deliver far better performance with the addition of a very small amount of redundant communication. We develop novel algorithms for Clos networks and single or multi-ported torus networks. Tests on a 32k-node BlueGene/P result in allgather speedups of up to 6x and reduce-scatter speedups of over 11x compared to the native IBM algorithm. Broadcast, reduce, and allreduce can be composed of allgather or reduce-scatter and other collective operations; our techniques also improve the performance of these algorithms.

Original languageEnglish (US)
Title of host publicationPPoPP'12 - Proceedings of the 2012 ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming
Pages45-54
Number of pages10
DOIs
StatePublished - 2012
Event17th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPoPP'12 - New Orleans, LA, United States
Duration: Feb 25 2012Feb 29 2012

Publication series

NameProceedings of the ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPOPP

Other

Other17th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPoPP'12
Country/TerritoryUnited States
CityNew Orleans, LA
Period2/25/122/29/12

Keywords

  • Collective-communication algorithms

ASJC Scopus subject areas

  • Software

Fingerprint

Dive into the research topics of 'Faster topology-aware collective algorithms through non-minimal communication'. Together they form a unique fingerprint.

Cite this