Push-Button Reliability Testing for Cloud-Backed Applications with Rainmaker

Yinfang Chen, Xudong Sun, Suman Nath, Ze Yang, Tianyin Xu

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Modern applications are moving towards a cloud-based programming model in which they depend on cloud services for various functionalities. Such “cloud native” practice greatly simplifies application deployment and realizes cloud benefits (e.g., availability). Meanwhile, it introduces new reliability challenges: applications must cope with the fault models of the opaque cloud and of less predictable Internet connections. In this paper, we discuss these reliability challenges. We develop a taxonomy of bugs that render cloud-backed applications vulnerable to common transient faults. We show that mishandling the transient errors of even a single REST call can adversely affect application correctness. We take a first step to address the challenges by building a “push-button” reliability testing tool named Rainmaker, as a basic SDK utility for any cloud-backed application. Rainmaker helps developers anticipate the myriad errors that can arise under the cloud-based fault model, without the need to write new policies, oracles, or test cases. Rainmaker works directly with existing test suites and is a plug-and-play tool for existing test environments. Rainmaker injects faults into the interactions between the application and cloud services. It does so at the REST layer, and is thus transparent to the applications under test. More importantly, it encodes automatic fault injection policies that cover the various taxonomized bug patterns, and automatic oracles that build on existing in-house software tests. To date, Rainmaker has detected 73 bugs (55 confirmed and 51 fixed) in 11 popular cloud-backed applications.
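The kind of bug the abstract describes can be illustrated with a small sketch. It is not Rainmaker's implementation: the service, the fault-injecting proxy, and the retry helper below are all hypothetical names invented for illustration. It assumes one transient fault, a timeout in which the request reaches the service but the response is lost, and shows how an unmodified test assertion can act as the oracle.

```python
class FakeQueueService:
    """Hypothetical in-memory stand-in for a cloud service reached over REST."""

    def __init__(self):
        self.messages = []

    def handle(self, method, path, body):
        if method == "POST" and path == "/messages":
            self.messages.append(body)
            return 201
        return 404


class TimeoutAfterDelivery:
    """Hypothetical REST-layer proxy (an assumption, not Rainmaker's API).

    The first request is delivered to the service, but its response is
    dropped, so the caller observes a timeout.
    """

    def __init__(self, service):
        self.service = service
        self.faults_left = 1

    def handle(self, method, path, body):
        status = self.service.handle(method, path, body)
        if self.faults_left > 0:
            self.faults_left -= 1
            raise TimeoutError("injected: response lost in transit")
        return status


def send_with_naive_retry(transport, body, attempts=3):
    """Application code under test: retries a non-idempotent POST on timeout."""
    for _ in range(attempts):
        try:
            return transport.handle("POST", "/messages", body)
        except TimeoutError:
            continue
    raise RuntimeError("gave up after repeated timeouts")


def run_existing_test(transport, service):
    """An unmodified application test; its assertion doubles as the oracle."""
    send_with_naive_retry(transport, "hello")
    return service.messages == ["hello"]


# Without fault injection the existing test passes.
clean = FakeQueueService()
passes_without_fault = run_existing_test(clean, clean)

# With one injected timeout the retry duplicates the message and the test fails.
faulty = FakeQueueService()
passes_with_fault = run_existing_test(TimeoutAfterDelivery(faulty), faulty)
```

In this sketch the application code and the test are identical in both runs; only the transport between them changes, which is the sense in which REST-layer injection is transparent to the application under test.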

Original language: English (US)
Title of host publication: Proceedings of the 20th USENIX Symposium on Networked Systems Design and Implementation, NSDI 2023
Publisher: USENIX Association
Pages: 1701-1716
Number of pages: 16
ISBN (Electronic): 9781939133335
State: Published - 2023
Event: 20th USENIX Symposium on Networked Systems Design and Implementation, NSDI 2023 - Boston, United States
Duration: Apr 17 2023 to Apr 19 2023

Publication series

Name: Proceedings of the 20th USENIX Symposium on Networked Systems Design and Implementation, NSDI 2023

Conference

Conference: 20th USENIX Symposium on Networked Systems Design and Implementation, NSDI 2023
Country/Territory: United States
City: Boston
Period: 4/17/23 to 4/19/23

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Control and Systems Engineering
