Low-cost error containment and recovery for onboard guarded software upgrading and beyond

Ann T. Tai, Kam S. Tso, Leon Alkalai, Savio N. Chau, William H. Sanders

Research output: Contribution to journal, Article, peer-review

Abstract

Message-driven confidence-driven (MDCD) error containment and recovery, a low-cost approach to mitigating the effect of software design faults in distributed embedded systems, is developed for onboard guarded software upgrading for deep-space missions. In this paper, we first describe and verify the MDCD algorithms in which we introduce the notion of "confidence-driven" to complement the "communication-induced" approach employed by a number of existing checkpointing protocols to achieve error containment and recovery efficiency. We then conduct a model-based analysis to show that the algorithms ensure low performance overhead. Finally, we discuss the advantages of the MDCD approach and its potential utility as a general-purpose, low-cost software fault tolerance technique for distributed embedded computing.

Original language: English (US)
Pages (from-to): 121-137
Number of pages: 17
Journal: IEEE Transactions on Computers
Volume: 51
Issue number: 2
State: Published - Feb 2002

Keywords

  • Distributed embedded systems
  • Global state consistency and recoverability
  • Guarded software upgrading
  • Message-driven confidence-driven
  • Performance overhead
  • Software fault tolerance

ASJC Scopus subject areas

  • Software
  • Theoretical Computer Science
  • Hardware and Architecture
  • Computational Theory and Mathematics
