Testing configuration changes in context to prevent production failures

Xudong Sun, Runxiang Cheng, Jianyan Chen, Elaine Ang, Owolabi Legunsen, Tianyin Xu

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Large-scale cloud services deploy hundreds of configuration changes to production systems daily. At such velocity, configuration changes have inevitably become prevalent causes of production failures. Existing misconfiguration detection and configuration validation techniques only check configuration values. These techniques cannot detect common types of failure-inducing configuration changes, such as those that cause code to fail or those that violate hidden constraints. We present ctests, a new type of tests for detecting failure-inducing configuration changes to prevent production failures. The idea behind ctests is simple: connecting production system configurations to software tests so that configuration changes can be tested in the context of code affected by the changes. As a result, ctests can detect configuration changes that expose dormant software bugs and diverse misconfigurations. We show how to generate ctests by transforming the many existing tests in mature systems. The key challenge that we address is the automated identification of test logic and oracles that can be reused in ctests. We generated thousands of ctests from the existing tests in five cloud systems. We evaluate ctests on real-world failure-inducing configuration changes, injected misconfigurations, and deployed configuration files from public Docker images. Our results show that ctests are effective in detecting failure-inducing configuration changes before deployment: they detect both the real-world failure-inducing configuration changes and misconfigurations in the deployed files.
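The core idea in the abstract, running existing test logic against the configuration that is about to be deployed rather than against hard-coded test values, can be illustrated with a minimal sketch. This is not the authors' implementation or tooling. The parameter name `cache.buffer.size` and the functions `parse_buffer_size`, `ctest_buffer_allocation`, and `run_ctests` are hypothetical, made up only for illustration.

```python
from typing import Callable, Dict, List

Config = Dict[str, str]


def parse_buffer_size(conf: Config) -> int:
    """Hypothetical system code that consumes a configuration parameter."""
    size = int(conf["cache.buffer.size"])  # raises on non-numeric values
    if size <= 0:
        # A hidden constraint that a value-only check might not know about.
        raise ValueError("cache.buffer.size must be positive")
    return size


def ctest_buffer_allocation(conf: Config) -> None:
    """Test logic parameterized by a configuration instead of fixed values."""
    buf = bytearray(parse_buffer_size(conf))
    # The oracle is phrased relative to the configuration under test.
    assert len(buf) == int(conf["cache.buffer.size"])


def run_ctests(ctests: List[Callable[[Config], None]], conf: Config) -> List[str]:
    """Run every ctest against one candidate configuration; return failures."""
    failures = []
    for test in ctests:
        try:
            test(conf)
        except Exception as exc:  # any exception counts as a failing ctest
            failures.append(f"{test.__name__}: {exc!r}")
    return failures


if __name__ == "__main__":
    suite = [ctest_buffer_allocation]
    for candidate in ({"cache.buffer.size": "4096"},
                      {"cache.buffer.size": "-1"},
                      {"cache.buffer.size": "4k"}):
        print(candidate, run_ctests(suite, candidate))
```

In this sketch, a configuration change is accepted only if `run_ctests` returns an empty list. Because the test exercises the code that reads the parameter, it can flag values that break that code even when the value looks plausible on its own.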

Original language: English (US)
Title of host publication: Proceedings of the 14th USENIX Symposium on Operating Systems Design and Implementation, OSDI 2020
Publisher: USENIX Association
Pages: 735-751
Number of pages: 17
ISBN (Electronic): 9781939133199
State: Published - 2020
Event: 14th USENIX Symposium on Operating Systems Design and Implementation, OSDI 2020 - Virtual, Online
Duration: Nov 4 2020 to Nov 6 2020

Publication series

Name: Proceedings of the 14th USENIX Symposium on Operating Systems Design and Implementation, OSDI 2020

Conference

Conference: 14th USENIX Symposium on Operating Systems Design and Implementation, OSDI 2020
City: Virtual, Online
Period: 11/4/20 to 11/6/20

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Hardware and Architecture
  • Information Systems
