Abstract

The cost of data movement on parallel systems varies greatly with machine architecture, job partition, and nearby jobs. Performance models that accurately capture the cost of data movement provide a tool for analysis, allowing for communication bottlenecks to be pinpointed. Modern heterogeneous architectures yield increased variance in data movement as there are a number of viable paths for inter-GPU communication. In this paper, we present performance models for the various paths of inter-node communication on modern heterogeneous architectures, including the trade-off between GPUDirect communication and copying to CPUs. Furthermore, we present a novel optimization for inter-node communication based on these models, utilizing all available CPU cores per node. Finally, we show associated performance improvements for MPI collective operations.

Original language: English (US)
Title of host publication: 2021 IEEE High Performance Extreme Computing Conference, HPEC 2021
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9781665423694
State: Published - 2021
Event: 2021 IEEE High Performance Extreme Computing Conference, HPEC 2021 - Virtual, Online, United States
Duration: Sep 20 2021 to Sep 24 2021

Publication series

Name: 2021 IEEE High Performance Extreme Computing Conference, HPEC 2021

Conference

Conference: 2021 IEEE High Performance Extreme Computing Conference, HPEC 2021
Country/Territory: United States
City: Virtual, Online
Period: 9/20/21 to 9/24/21

Keywords

  • CUDA-Aware
  • GPU
  • GPUDirect
  • MPI
  • data movement
  • performance modeling

ASJC Scopus subject areas

  • Modeling and Simulation
  • Artificial Intelligence
  • Computer Networks and Communications
  • Computer Science Applications
  • Hardware and Architecture
  • Computational Mathematics

Title: Modeling Data Movement Performance on Heterogeneous Architectures
