TY - GEN
T1 - Verifying IO Synchronization from MPI Traces
AU - Yellapragada, Sushma
AU - Wang, Chen
AU - Snir, Marc
N1 - This work was supported by NSF grant CCF 17-63540.
PY - 2021
Y1 - 2021
N2 - The paper addresses the following question: Are IO operations of HPC applications properly synchronized? We focus on parallel file systems that satisfy POSIX semantics. The outcome of I/O operations is well-defined provided that conflicting accesses to a file location are not concurrent, but are ordered. Accesses to distinct processes are ordered by the executed MPI communication. We derive the "happens-before" relation between I/O calls of HPC runs by analyzing traces collected during program execution. Various optimizations reduce the analysis overhead. We collected traces from 17 representative HPC applications. We found that 10 of them do not perform conflicting I/O accesses and, hence, are properly synchronized by default. The remaining 7 applications properly synchronize the conflicting I/O accesses.
AB - The paper addresses the following question: Are IO operations of HPC applications properly synchronized? We focus on parallel file systems that satisfy POSIX semantics. The outcome of I/O operations is well-defined provided that conflicting accesses to a file location are not concurrent, but are ordered. Accesses to distinct processes are ordered by the executed MPI communication. We derive the "happens-before" relation between I/O calls of HPC runs by analyzing traces collected during program execution. Various optimizations reduce the analysis overhead. We collected traces from 17 representative HPC applications. We found that 10 of them do not perform conflicting I/O accesses and, hence, are properly synchronized by default. The remaining 7 applications properly synchronize the conflicting I/O accesses.
UR - http://www.scopus.com/inward/record.url?scp=85124178904&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85124178904&partnerID=8YFLogxK
U2 - 10.1109/PDSW54622.2021.00012
DO - 10.1109/PDSW54622.2021.00012
M3 - Conference contribution
AN - SCOPUS:85124178904
T3 - Proceedings of PDSW 2021: IEEE/ACM 6th International Parallel Data Systems Workshop, Held in conjunction with SC 2021: The International Conference for High Performance Computing, Networking, Storage and Analysis
SP - 41
EP - 46
BT - Proceedings of PDSW 2021
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 6th IEEE/ACM International Parallel Data Systems Workshop, PDSW 2021
Y2 - 15 November 2021
ER -