Redundancy does not imply fault tolerance: Analysis of distributed storage reactions to single errors and corruptions

Aishwarya Ganesan, Ramnatthan Alagappan, Andrea C. Arpaci-Dusseau, Remzi H. Arpaci-Dusseau

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution


We analyze how modern distributed storage systems behave in the presence of file-system faults such as data corruption and read and write errors. We characterize eight popular distributed storage systems and uncover numerous bugs related to file-system fault tolerance. We find that modern distributed systems do not consistently use redundancy to recover from file-system faults: a single file-system fault can cause catastrophic outcomes such as data loss, corruption, and unavailability. Our results have implications for the design of next-generation fault-tolerant distributed and cloud storage systems.

Original language: English (US)
Title of host publication: Proceedings of the 15th USENIX Conference on File and Storage Technologies, FAST 2017
Publisher: USENIX Association
Number of pages: 17
ISBN (Electronic): 9781931971362
State: Published - 2017
Externally published: Yes
Event: 15th USENIX Conference on File and Storage Technologies, FAST 2017 - Santa Clara, United States
Duration: Feb 27 2017 to Mar 2 2017


Conference: 15th USENIX Conference on File and Storage Technologies, FAST 2017
Country/Territory: United States
City: Santa Clara

ASJC Scopus subject areas

  • Hardware and Architecture
  • Software
  • Computer Networks and Communications

