Power management of extreme-scale networks with on/off links in runtime systems

Ehsan Totoni, Nikhil Jain, Laxmikant V. Kale

Research output: Contribution to journal › Article › peer-review

Abstract

Networks are among the major power consumers in large-scale parallel systems. During execution of common parallel applications, a sizeable fraction of the links in high-radix interconnects are either never used or underutilized. We propose a runtime-system-based adaptive approach to turn off unused links, which has various advantages over the previously proposed hardware- and compiler-based approaches. We discuss why the runtime system is the best system component to accomplish this task, and test the effectiveness of our approach using real applications (including NAMD and MILC) and application benchmarks (including NAS Parallel Benchmarks and Stencil). These codes are simulated on representative topologies such as a 6-D Torus and a multilevel directly connected network (similar to IBM PERCS in Power 775 and Dragonfly in Cray Aries). For common applications with near-neighbor communication patterns, our approach can save up to 20% of the total machine's power and energy, without any performance penalty.

Original language: English (US)
Article number: a16
Journal: ACM Transactions on Parallel Computing
Volume: 1
Issue number: 2
State: Published - Jan 2015

Keywords

  • Algorithms
  • Design
  • Measurement
  • Performance

ASJC Scopus subject areas

  • Software
  • Modeling and Simulation
  • Hardware and Architecture
  • Computer Science Applications
  • Computational Theory and Mathematics
