Collective algorithms for multiported torus networks

Paul Sack, William Gropp

Research output: Contribution to journalArticlepeer-review


Modern supercomputers with torus networks allow each node to simultaneously pass messages on all of its links. However, most collective algorithms are designed to only use one link at a time. In this work, we present novel multiported algorithms for the scatter, gather, all-gather, and reduce-scatter operations. Our algorithms can be combined to create multiported reduce, all-reduce, and broadcast algorithms. Several of these algorithms involve a new technique where we relax the MPI message-ordering constraints to achieve high performance and restore the correct ordering using an additional stage of redundant communication. According to our models, on an n-dimensional torus, our algorithms should allow for nearly a 2n-fold improvement in communication performance compared to known, single-ported torus algorithms. In practice, we have achieved nearly 6x better performance on a 32k-node 3-dimensional torus.

Original languageEnglish (US)
Article numbera12
JournalACM Transactions on Parallel Computing
Issue number2
StatePublished - Jan 2015


  • Collective algorithms
  • Message-passing

ASJC Scopus subject areas

  • Software
  • Modeling and Simulation
  • Hardware and Architecture
  • Computer Science Applications
  • Computational Theory and Mathematics


Dive into the research topics of 'Collective algorithms for multiported torus networks'. Together they form a unique fingerprint.

Cite this