Optimizing point-to-point communication between adaptive MPI endpoints in shared memory

Research output: Contribution to journal › Article › peer-review

Abstract

Adaptive MPI (AMPI) is an implementation of the MPI standard that supports the virtualization of ranks as user-level threads, rather than OS processes. In this work, we optimize the communication performance of AMPI based on the locality of the endpoints communicating within a cluster of SMP nodes. We differentiate between point-to-point messages with both endpoints co-located on the same execution unit and point-to-point messages with both endpoints residing in the same process but not on the same execution unit. We demonstrate how the messaging semantics of Charm++ enable and hinder AMPI's implementation in different ways, and we motivate extensions to Charm++ to address the limitations. Using the OSU micro-benchmark suite, we show that our locality-aware design offers lower latency, higher bandwidth, and a reduced memory footprint for applications.
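
The locality distinction drawn in the abstract can be illustrated with a small sketch. This is not AMPI or Charm++ code: the `EndpointLocation` struct, the `Locality` enum, and the `classify` function are hypothetical names introduced here, and the two coarser cases (same node, remote) are an assumption added for completeness, since the abstract names only the first two.

```cpp
#include <cstdint>

// Hypothetical description of where a virtualized rank currently resides.
// The field names are illustrative and are not taken from AMPI or Charm++.
struct EndpointLocation {
    std::uint32_t node;     // physical SMP node
    std::uint32_t process;  // OS process on that node
    std::uint32_t pe;       // execution unit within that process
};

enum class Locality {
    SameExecutionUnit,  // both ranks are user-level threads on one execution unit
    SameProcess,        // same process, different execution units
    SameNode,           // assumption: different processes on one node
    Remote              // assumption: different nodes
};

// Compare from the coarsest level down, so that a matching `pe` value is
// only meaningful once node and process are already known to match.
constexpr Locality classify(const EndpointLocation& a, const EndpointLocation& b) {
    if (a.node != b.node) return Locality::Remote;
    if (a.process != b.process) return Locality::SameNode;
    if (a.pe != b.pe) return Locality::SameProcess;
    return Locality::SameExecutionUnit;
}
```

A messaging layer could, under these assumptions, branch on the result of `classify` to pick a delivery path for each point-to-point message; which paths the authors actually implement is described in the article itself, not here.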

Original language: English (US)
Article number: e4467
Journal: Concurrency and Computation: Practice and Experience
Volume: 32
Issue number: 3
State: Published - Feb 10 2020

Keywords

  • AMPI
  • MPI
  • endpoints
  • intra-node communication
  • shared memory optimizations

ASJC Scopus subject areas

  • Software
  • Theoretical Computer Science
  • Computer Science Applications
  • Computer Networks and Communications
  • Computational Theory and Mathematics
