Abstract
One-sided communication in the Message Passing Interface (MPI) requires the use of one of three different synchronization mechanisms, which indicate when a one-sided operation can be started and when it has completed. Efficient implementation of the synchronization mechanisms is critical to achieving good performance with one-sided communication. However, our performance measurements indicate that in many MPI implementations, the synchronization functions add significant overhead, resulting in one-sided communication performing much worse than point-to-point communication for short- and medium-sized messages. In this paper, we describe our efforts to minimize the overhead of synchronization in our implementation of one-sided communication in MPICH2. We describe our optimizations for all three synchronization mechanisms defined in MPI: fence, post-start-complete-wait, and lock-unlock. Our performance results demonstrate that, for short messages, MPICH2 is six times faster than LAM for fence synchronization and 50% faster for post-start-complete-wait synchronization, and it is more than twice as fast as Sun MPI for all three synchronization methods.
Original language | English (US) |
---|---|
Pages (from-to) | 119-128 |
Number of pages | 10 |
Journal | International Journal of High Performance Computing Applications |
Volume | 19 |
Issue number | 2 SPEC. ISS. |
State | Published - Jun 2005 |
Externally published | Yes |
Keywords
- MPI
- One-sided communication
- Remote-memory access
- Synchronization
ASJC Scopus subject areas
- Software
- Theoretical Computer Science
- Hardware and Architecture