Quantifying network contention on large parallel machines

Abhinav Bhatelé, Laxmikant V. Kalé

Research output: Contribution to journal › Article › peer-review

Abstract

In the early years of parallel computing research, significant theoretical studies were done on interconnect topologies and topology-aware mapping for parallel computers. With the deployment of virtual cut-through, wormhole routing and faster interconnects, message latencies fell and research in the area died down. This article shows that network topology has become important again with the emergence of very large supercomputers, typically connected as a 3D torus or mesh. It presents a quantitative study of the effect of contention on message latencies on torus and mesh networks. Several MPI benchmarks are used to evaluate how the number of hops (links) traversed by messages affects their latencies. The benchmarks demonstrate that when multiple messages compete for network resources, link occupancy or contention can increase message latencies by up to a factor of 8 on some architectures. Results are shown for three parallel machines: ANL's IBM Blue Gene/P (Surveyor), ORNL's Cray XT4 (Jaguar) and PSC's Cray XT3 (BigBen). The findings suggest that application developers should now consider interconnect topologies when mapping tasks to processors in order to obtain the best performance on large parallel machines.

Original language: English (US)
Pages (from-to): 553-572
Number of pages: 20
Journal: Parallel Processing Letters
Volume: 19
Issue number: 4
State: Published - Dec 2009

Keywords

  • Contention
  • Link sharing
  • MPI
  • Topology aware mapping
  • Torus interconnects

ASJC Scopus subject areas

  • Software
  • Theoretical Computer Science
  • Hardware and Architecture
