Can applications recover from fsync failures?

Anthony Rebello, Yuvraj Patel, Ramnatthan Alagappan, Andrea C. Arpaci-Dusseau, Remzi H. Arpaci-Dusseau

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

We analyze how file systems and modern data-intensive applications react to fsync failures. First, we characterize how three Linux file systems (ext4, XFS, Btrfs) behave in the presence of failures. We find commonalities across file systems (pages are always marked clean, certain block writes always lead to unavailability), as well as differences (page content and failure reporting vary). Next, we study how five widely used applications (PostgreSQL, LMDB, LevelDB, SQLite, Redis) handle fsync failures. Our findings show that although applications use many failure-handling strategies, none are sufficient: fsync failures can cause catastrophic outcomes such as data loss and corruption. Our findings have strong implications for the design of file systems and applications that intend to provide strong durability guarantees.
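The observation that pages are marked clean after a failed fsync has a practical consequence: calling fsync again may report success even though the data never reached disk. The sketch below illustrates one conservative reaction, which is to treat the first fsync error as final and leave the previous file contents in place. It is not taken from the paper or from any of the studied applications; the names `durable_write` and `DurabilityError` and the injectable `fsync` parameter are assumptions made for illustration, and the sketch omits syncing the parent directory after the rename.

```python
import os
import tempfile


class DurabilityError(Exception):
    """Hypothetical error type: the data could not be confirmed durable."""


def durable_write(path, data, fsync=os.fsync):
    """Write `data` to `path` via a temporary file and a rename.

    Any fsync failure is treated as fatal. The function does not retry
    fsync, because after a failure the dirty pages may already be marked
    clean, so a second call could return success without persisting data.
    The `fsync` parameter exists only so a failure can be injected in tests.
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()  # push Python's buffer into the OS page cache
            try:
                fsync(f.fileno())  # ask the OS to persist the pages
            except OSError as exc:
                raise DurabilityError(
                    f"fsync failed for {tmp_path}; not retrying"
                ) from exc
        # Only replace the old file once fsync has reported success.
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temporary file; the old contents of `path` remain.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
```

A caller would typically catch `DurabilityError`, stop accepting writes, and recover from a known-good state such as a log or a replica, rather than continuing on the assumption that a later fsync will flush the lost pages.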

Original language: English (US)
Title of host publication: Proceedings of the 2020 USENIX Annual Technical Conference, ATC 2020
Publisher: USENIX Association
Pages: 753-767
Number of pages: 15
ISBN (Electronic): 9781939133144
State: Published - 2020
Externally published: Yes
Event: 2020 USENIX Annual Technical Conference, ATC 2020 - Virtual, Online
Duration: Jul 15 2020 → Jul 17 2020

Conference

Conference: 2020 USENIX Annual Technical Conference, ATC 2020
City: Virtual, Online
Period: 7/15/20 → 7/17/20

ASJC Scopus subject areas

  • General Computer Science
