Scaling file systems to support petascale clusters: A dependability analysis to support informed design choices

Shravan Gaonkar, Eric Rozier, Anthony Tong, William H. Sanders

Research output: Contribution to conference (Paper)

Abstract

Petascale computing requires I/O subsystems that can keep up with the dramatic computing power demanded by such systems. TOP500.org ranks top computers based on their peak compute performance, but there has not been adequate investigation of the current state-of-the-art and future requirements of storage area networks that support petascale computers. Dependable scaling of an I/O subsystem to support petascale computing is not as simple as adding more storage servers. In this paper, we present a stochastic activity network model that uses failure rates computed from real logs to predict the reliability and availability of the storage architecture of the Abe cluster at the National Center for Supercomputing Applications (NCSA). We then use the model to evaluate the challenges encountered as one scales the number of storage servers to support petascale computing. The results present new insights regarding the dependability challenges that will be encountered when building next-generation petabyte storage. Furthermore, we provide insight into a new design approach that will enable system designers to integrate the trace-based analysis of parameter values from real system data into their stochastic models to allow informed design choices.

Original language: English (US)
Pages: 386-391
Number of pages: 6
DOIs: https://doi.org/10.1109/DSN.2008.4630107
State: Published - Oct 13 2008
Event: 2008 International Conference on Dependable Systems and Networks, DSN-2008 - Anchorage, AK, United States
Duration: Jun 24 2008 to Jun 27 2008

Other

Other: 2008 International Conference on Dependable Systems and Networks, DSN-2008
Country: United States
City: Anchorage, AK
Period: 6/24/08 to 6/27/08

Keywords

  • Data analysis
  • Modeling techniques
  • Reliability and availability
  • Simulation
  • Storage systems

ASJC Scopus subject areas

  • Software
  • Hardware and Architecture
  • Computer Networks and Communications

Cite this

Gaonkar, S., Rozier, E., Tong, A., & Sanders, W. H. (2008). Scaling file systems to support petascale clusters: A dependability analysis to support informed design choices. 386-391. Paper presented at 2008 International Conference on Dependable Systems and Networks, DSN-2008, Anchorage, AK, United States. https://doi.org/10.1109/DSN.2008.4630107