Abstract
Two complementary techniques have evolved for providing fault-tolerance in software: forward error recovery and backward error recovery. Few implementations permit both approaches to be combined within a single application. Even fewer techniques are available for constructing fault-tolerant software for systems involving concurrent processes and multiple processors. Many schemes for supporting forward or backward recovery are based on some concept of an atomic action. In this paper, we propose a mechanism for supporting an atomic action in a system of Communicating Sequential Processes (CSP). This atomic action, called an FT-Action, is the basic unit for providing fault-tolerance, and both forward and backward error recovery are performed in the context of an FT-Action. An implementation of the FT-Action is proposed that employs distributed control, uses CSP primitives, and supports local compile-time and run-time checking of the forward and backward error recovery schemes.
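To make the distinction between the two recovery styles concrete, the following is a minimal illustrative sketch of one action that supports both. It is not the paper's FT-Action implementation: it runs sequentially rather than as communicating CSP processes, it does not model distributed control, and every name in it (`FTAction`, `LinkError`, `conserved`, `finish_transfer`, `demo`) is hypothetical. It assumes that backward recovery means restoring a checkpoint taken on entry and running an alternate, and that forward recovery means running a handler registered for an anticipated exception.

```python
import copy


class LinkError(Exception):
    """Hypothetical anticipated fault that has a forward-recovery handler."""


class FTAction:
    """Illustrative atomic action: it ends in an accepted state or restores entry state."""

    def __init__(self, states, acceptance_test):
        self.states = states                    # participant name -> mutable state
        self.acceptance_test = acceptance_test  # predicate over all participant states
        self.handlers = {}                      # exception type -> forward-recovery handler

    def on(self, exc_type, handler):
        self.handlers[exc_type] = handler

    def run(self, primary, alternate):
        checkpoint = copy.deepcopy(self.states)  # recovery point taken on entry
        outcome = "primary"
        try:
            primary(self.states)
        except Exception as exc:
            handler = self.handlers.get(type(exc))
            if handler is None:
                outcome = None                   # unanticipated fault: no forward recovery
            else:
                handler(self.states, exc)        # forward recovery: repair state in place
                outcome = "forward"
        if outcome is not None and self.acceptance_test(self.states):
            return outcome
        self._restore(checkpoint)                # backward recovery: roll back, then retry
        alternate(self.states)
        if self.acceptance_test(self.states):
            return "backward"
        self._restore(checkpoint)
        raise RuntimeError("action failed; entry state restored")

    def _restore(self, checkpoint):
        self.states.clear()
        self.states.update(copy.deepcopy(checkpoint))


def conserved(states):
    # Acceptance test: the total across participants must be unchanged.
    return sum(p["balance"] for p in states.values()) == 100


def make_primary(fault):
    def primary(states):
        states["a"]["balance"] -= 30
        raise fault                              # fault strikes before "b" is credited
    return primary


def alternate(states):
    states["a"]["balance"] -= 30
    states["b"]["balance"] += 30


def finish_transfer(states, exc):
    # Forward-recovery handler: complete the interrupted transfer.
    states["b"]["balance"] = 100 - states["a"]["balance"]


def demo(fault):
    states = {"a": {"balance": 100}, "b": {"balance": 0}}
    action = FTAction(states, conserved)
    action.on(LinkError, finish_transfer)
    return action.run(make_primary(fault), alternate), states
```

Calling `demo(LinkError("link down"))` takes the forward path because a handler is registered for that exception type, while `demo(ValueError("unanticipated"))` has no handler and falls back to the checkpoint and the alternate. In both cases the action ends only once the acceptance test holds for all participants together, which is the atomicity property the sketch is meant to illustrate.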
Original language | English (US) |
---|---|
Pages (from-to) | 59-68 |
Number of pages | 10 |
Journal | IEEE Transactions on Software Engineering |
Volume | SE-12 |
Issue number | 1 |
State | Published - Jan 1986 |
Keywords
- Atomic actions
- backward recovery
- communicating sequential processes
- forward recovery
- software fault-tolerance
ASJC Scopus subject areas
- Software