Automatic recovery using bounded partially observable Markov decision processes

Kaustubh R. Joshi, Matti A. Hiltunen, William H. Sanders, Richard D. Schlichting

Research output: Chapter in Book/Report/Conference proceedingConference contribution

Abstract

This paper provides a technique, based on partially observable Markov decision processes (POMDPs), for building automatic recovery controllers to guide distributed system recovery in a way that provides provable assurances on the quality of the generated recovery actions even when the diagnostic information may be imprecise. Lower bounds on the cost of recovery are introduced and proved, and it is shown how the characteristics of the recovery process can be used to ensure that the lower bounds converge even on undiscounted models. The bounds used in an appropriate online controller provide it with provable termination properties. Simulation-based experimental results on a realistic e-commerce system demonstrate that the proposed bounds can be improved iteratively, and the resulting controller convincingly outperforms a controller that uses heuristics instead of bounds.

Original languageEnglish (US)
Title of host publicationProceedings - DSN 2006
Subtitle of host publication2006 International Conference on Dependable Systems and Networks
Pages445-454
Number of pages10
DOIs
StatePublished - 2006
EventDSN 2006: 2006 International Conference on Dependable Systems and Networks - Philadelphia, PA, United States
Duration: Jun 25 2006Jun 28 2006

Publication series

NameProceedings of the International Conference on Dependable Systems and Networks
Volume2006

Other

OtherDSN 2006: 2006 International Conference on Dependable Systems and Networks
Country/TerritoryUnited States
CityPhiladelphia, PA
Period6/25/066/28/06

ASJC Scopus subject areas

  • Software
  • Hardware and Architecture
  • Computer Networks and Communications

Fingerprint

Dive into the research topics of 'Automatic recovery using bounded partially observable Markov decision processes'. Together they form a unique fingerprint.

Cite this