Efficient structured data access in parallel file systems

Avery Ching, Alok Choudhary, Wei Keng Liao, Robert Ross, William Gropp

Research output: Chapter in Book/Report/Conference proceeding (Conference contribution)

Abstract

Parallel scientific applications store and retrieve very large, structured datasets. Directly supporting these structured accesses is an important step in providing high-performance I/O solutions for these applications. High-level interfaces such as HDF5 and Parallel netCDF provide convenient APIs for accessing structured datasets, and the MPI-IO interface also supports efficient access to structured data. However, parallel file systems do not traditionally support such access. In this work we present an implementation of structured data access support in the context of the Parallel Virtual File System (PVFS). We call this support "datatype I/O" because of its similarity to MPI datatypes. This support is built by using a reusable datatype-processing component from the MPICH2 MPI implementation. We describe how this component is leveraged to efficiently process structured data representations resulting from MPI-IO operations. We quantitatively assess the solution using three test applications. We also point to further optimizations in the processing path that could be leveraged for even more efficient operation.
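
As a rough illustration of why a datatype-style description helps, the sketch below contrasts a compact strided description with the explicit offset-length list it expands to. It is a minimal sketch and not the PVFS or MPICH2 interface: the `Vector` class and the `flatten` and `read_structured` helpers are hypothetical names introduced here, and an in-memory byte string stands in for a file.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Vector:
    """Hypothetical compact description of a strided access:
    `count` blocks of `blocklen` bytes, block starts `stride` bytes apart."""
    count: int
    blocklen: int
    stride: int


def flatten(v: Vector, base: int = 0) -> List[Tuple[int, int]]:
    """Expand the compact description into explicit (offset, length) pairs."""
    return [(base + i * v.stride, v.blocklen) for i in range(v.count)]


def read_structured(buf: bytes, v: Vector, base: int = 0) -> bytes:
    """Gather the described regions from an in-memory stand-in for a file."""
    return b"".join(buf[off:off + length] for off, length in flatten(v, base))


# A 2-byte-wide column slice of a row-major array with 4 rows of 8 bytes each.
column = Vector(count=4, blocklen=2, stride=8)
regions = flatten(column, base=2)

data = bytes(range(32))  # stand-in file contents: byte i has value i
picked = read_structured(data, column, base=2)
```

The description stays at three integers no matter how many rows are accessed, while the flattened list grows by one pair per row. Sending the compact form and expanding it where the data lives is the general idea the abstract attributes to datatype I/O.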

Original language: English (US)
Title of host publication: Proceedings - IEEE International Conference on Cluster Computing, CLUSTER 2003
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 326-335
Number of pages: 10
ISBN (Electronic): 0769520669
State: Published - 2003
Externally published: Yes
Event: IEEE International Conference on Cluster Computing, CLUSTER 2003 - Hong Kong, China
Duration: Dec 1 2003 to Dec 4 2003

Publication series

Name: Proceedings - IEEE International Conference on Cluster Computing, ICCC
Volume: 2003-January
ISSN (Print): 1552-5244

Other

Other: IEEE International Conference on Cluster Computing, CLUSTER 2003
Country: China
City: Hong Kong
Period: 12/1/03 to 12/4/03

ASJC Scopus subject areas

  • Software
  • Hardware and Architecture
  • Signal Processing
