Efficient implementation of MPI-3 RMA over openFabrics interfaces

Hajime Fujita, Chongxiao Cao, Sayantan Sur, Charles Archer, Erik Paulson, Maria Garzaran

Research output: Contribution to journal › Article › peer-review

Abstract

The Message Passing Interface (MPI) standard supports Remote Memory Access (RMA) operations, in which a process can read or write the memory of another process without requiring the target process to be involved in the communication. This enables new, more efficient programming models. This paper describes the RMA design and implementation in MPICH-OFI, an MPICH-based open source implementation of the MPI standard that uses the OpenFabrics Interfaces (OFI) to communicate with the underlying network fabric. MPICH-OFI is based on a new communication layer called CH4, which was designed to achieve high performance by minimizing runtime software overhead and by having an internal API that is well aligned with MPI functions. OFI is a lightweight communication framework that supports modern high-speed interconnects. Thanks to CH4 and OFI, MPICH-OFI achieves low latency and high bandwidth for RMA operations. Our experimental results using microbenchmarks show that, on Intel® Omni-Path Architecture, MPICH-OFI achieves more than 3x better put/get latency and bandwidth than MPICH CH3, 10% better latency than Open MPI and MVAPICH2, and more than 1.7x higher bandwidth than MVAPICH2 for small messages (≤ 4 KB).

Original language: English (US)
Pages (from-to): 1-10
Number of pages: 10
Journal: Parallel Computing
Volume: 87
State: Published - Sep 2019
Externally published: Yes

Keywords

  • MPI
  • MPICH-OFI
  • Message Passing Interface
  • One-sided Communications
  • OpenFabrics Interfaces (OFI)
  • Remote Memory Access (RMA)

ASJC Scopus subject areas

  • Software
  • Theoretical Computer Science
  • Hardware and Architecture
  • Computer Networks and Communications
  • Computer Graphics and Computer-Aided Design
  • Artificial Intelligence
