Verifying IO Synchronization from MPI Traces

Sushma Yellapragada, Chen Wang, Marc Snir

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

The paper addresses the following question: Are I/O operations of HPC applications properly synchronized? We focus on parallel file systems that satisfy POSIX semantics. The outcome of I/O operations is well-defined provided that conflicting accesses to a file location are not concurrent, but are ordered. Accesses by distinct processes are ordered by the executed MPI communication. We derive the "happens-before" relation between I/O calls of HPC runs by analyzing traces collected during program execution. Various optimizations reduce the analysis overhead. We collected traces from 17 representative HPC applications. We found that 10 of them do not perform conflicting I/O accesses and, hence, are properly synchronized by default. The remaining 7 applications properly synchronize the conflicting I/O accesses.
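
The idea of checking conflicting accesses against a happens-before relation can be illustrated with a minimal sketch. This is not the paper's implementation: it assumes a hypothetical trace format (a list of dicts with `rank`, `op`, `msg`, `start`, `end` fields), a single shared file, point-to-point messages only, and a trace in which each send is listed before its matching receive. It uses vector clocks, a standard technique for deriving happens-before, and flags pairs of accesses from different ranks that overlap, include at least one write, and are not ordered.

```python
from itertools import combinations


def vector_clocks(trace, nranks):
    """Return one vector-clock stamp per event in the trace.

    Assumption: a send appears in the trace before its matching recv.
    """
    clock = [[0] * nranks for _ in range(nranks)]
    sent = {}  # message id -> sender's clock at the time of the send
    stamps = []
    for ev in trace:
        r = ev["rank"]
        if ev["op"] == "recv":
            # A receive inherits everything the sender knew at the send.
            clock[r] = [max(a, b) for a, b in zip(clock[r], sent[ev["msg"]])]
        clock[r][r] += 1
        if ev["op"] == "send":
            sent[ev["msg"]] = list(clock[r])
        stamps.append(tuple(clock[r]))
    return stamps


def happens_before(a, b):
    """True if stamp a is ordered strictly before stamp b."""
    return a != b and all(x <= y for x, y in zip(a, b))


def unsynchronized_conflicts(trace, nranks):
    """Return index pairs of conflicting I/O events that are concurrent."""
    stamps = vector_clocks(trace, nranks)
    io = [(i, ev) for i, ev in enumerate(trace) if ev["op"] in ("read", "write")]
    races = []
    for (i, a), (j, b) in combinations(io, 2):
        overlap = a["start"] < b["end"] and b["start"] < a["end"]
        conflict = (
            overlap
            and "write" in (a["op"], b["op"])
            and a["rank"] != b["rank"]
        )
        ordered = happens_before(stamps[i], stamps[j]) or happens_before(
            stamps[j], stamps[i]
        )
        if conflict and not ordered:
            races.append((i, j))
    return races
```

In this sketch, a write on rank 0 followed by a send to rank 1, whose receive precedes a read of the same byte range, is reported as synchronized; the same write and read without the intervening message are reported as an unordered conflict. The pairwise comparison is quadratic in the number of I/O events, so a practical analysis would need optimizations of the kind the abstract alludes to.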

Original language: English (US)
Title of host publication: Proceedings of PDSW 2021
Subtitle of host publication: IEEE/ACM 6th International Parallel Data Systems Workshop, Held in conjunction with SC 2021: The International Conference for High Performance Computing, Networking, Storage and Analysis
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 41-46
Number of pages: 6
ISBN (Electronic): 9781665418379
DOIs
State: Published - 2021
Event: 6th IEEE/ACM International Parallel Data Systems Workshop, PDSW 2021 - St. Louis, United States
Duration: Nov 15 2021 → …

Publication series

Name: Proceedings of PDSW 2021: IEEE/ACM 6th International Parallel Data Systems Workshop, Held in conjunction with SC 2021: The International Conference for High Performance Computing, Networking, Storage and Analysis

Conference

Conference: 6th IEEE/ACM International Parallel Data Systems Workshop, PDSW 2021
Country/Territory: United States
City: St. Louis
Period: 11/15/21 → …

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Hardware and Architecture
  • Information Systems
  • Information Systems and Management
  • Safety, Risk, Reliability and Quality
