Abstract
Message-driven confidence-driven (MDCD) error containment and recovery, a low-cost approach to mitigating the effect of software design faults in distributed embedded systems, is developed for onboard guarded software upgrading for deep-space missions. In this paper, we first describe and verify the MDCD algorithms in which we introduce the notion of "confidence-driven" to complement the "communication-induced" approach employed by a number of existing checkpointing protocols to achieve error containment and recovery efficiency. We then conduct a model-based analysis to show that the algorithms ensure low performance overhead. Finally, we discuss the advantages of the MDCD approach and its potential utility as a general-purpose, low-cost software fault tolerance technique for distributed embedded computing.
Original language | English (US) |
---|---|
Pages (from-to) | 121-137 |
Number of pages | 17 |
Journal | IEEE Transactions on Computers |
Volume | 51 |
Issue number | 2 |
State | Published - Feb 2002 |
Keywords
- Distributed embedded systems
- Global state consistency and recoverability
- Guarded software upgrading
- Message-driven confidence-driven
- Performance overhead
- Software fault tolerance
ASJC Scopus subject areas
- Software
- Theoretical Computer Science
- Hardware and Architecture
- Computational Theory and Mathematics