Design and implementation of MPICH2 over InfiniBand with RDMA support

Jiuxing Liu, Weihang Jiang, Pete Wyckoff, Dhabaleswar K. Panda, David Ashton, Darius Buntinas, William Gropp, Brian Toonen

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

For several years, MPI has been the de facto standard for writing parallel applications. One of the most popular MPI implementations is MPICH. Its successor, MPICH2, features a completely new design that provides more performance and flexibility. To ensure portability, it has a hierarchical structure that allows porting to be done at different levels. In this paper, we present our experiences in designing and implementing MPICH2 over InfiniBand. Because of its high performance and open standard, InfiniBand is gaining popularity in the area of high-performance computing. Our study focuses on optimizing the performance of MPI-1 functions in MPICH2. One of our objectives is to exploit Remote Direct Memory Access (RDMA) in InfiniBand to achieve high performance. We have based our design on the RDMA Channel interface provided by MPICH2, which encapsulates architecture-dependent communication functionalities into a very small set of functions. Starting with a basic design, we apply different optimizations and also propose a zero-copy-based design. We characterize the impact of our optimizations and designs using microbenchmarks. We have also performed an application-level evaluation using the NAS Parallel Benchmarks. Our optimized MPICH2 implementation achieves 7.6 μs latency and 857 MB/s bandwidth, which are close to the raw performance of the underlying InfiniBand layer. Our study shows that the RDMA Channel interface in MPICH2 provides a simple, yet powerful, abstraction that enables implementations with high performance by exploiting RDMA operations in InfiniBand. To the best of our knowledge, this is the first high-performance design and implementation of MPICH2 on InfiniBand using RDMA support.
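As a rough illustration of how microbenchmark figures such as the latency and bandwidth quoted above are commonly derived, the sketch below converts raw timings into a one-way latency and a throughput value. It is not the authors' benchmark code: the function names, the ping-pong assumption (one round trip per iteration), the choice of 1 MB = 10^6 bytes, and the sample timings are all assumptions made for illustration.

```python
def one_way_latency_us(total_seconds: float, iterations: int) -> float:
    """Average one-way latency in microseconds from a ping-pong loop.

    Assumes each iteration is one full round trip (a send plus the
    matching reply), so the one-way time is half the per-iteration time.
    """
    if iterations <= 0:
        raise ValueError("iterations must be positive")
    return total_seconds / (2 * iterations) * 1e6


def bandwidth_mb_per_s(total_bytes: int, total_seconds: float) -> float:
    """Throughput in MB/s, assuming 1 MB = 10**6 bytes (an assumption)."""
    if total_seconds <= 0:
        raise ValueError("total_seconds must be positive")
    return total_bytes / total_seconds / 1e6


if __name__ == "__main__":
    # Hypothetical timings, chosen only to illustrate the arithmetic.
    print(one_way_latency_us(total_seconds=0.0152, iterations=1000))
    print(bandwidth_mb_per_s(total_bytes=857_000_000, total_seconds=1.0))
```

In a real measurement, the timings would come from timing a loop of small-message exchanges (for latency) or a stream of large messages (for bandwidth) between two processes.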

Original language: English (US)
Title of host publication: Proceedings - 18th International Parallel and Distributed Processing Symposium, IPDPS 2004 (Abstracts and CD-ROM)
Pages: 223-232
Number of pages: 10
State: Published - 2004
Externally published: Yes
Event: Proceedings - 18th International Parallel and Distributed Processing Symposium, IPDPS 2004 (Abstracts and CD-ROM) - Santa Fe, NM, United States
Duration: Apr 26 2004 → Apr 30 2004

Publication series

Name: Proceedings - International Parallel and Distributed Processing Symposium, IPDPS 2004 (Abstracts and CD-ROM)
Volume: 18

Other

Other: Proceedings - 18th International Parallel and Distributed Processing Symposium, IPDPS 2004 (Abstracts and CD-ROM)
Country/Territory: United States
City: Santa Fe, NM
Period: 4/26/04 → 4/30/04

ASJC Scopus subject areas

  • Engineering(all)
