System level diagnosis: Combining detection and location

Nitin H. Vaidya, Dhiraj K. Pradhan

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

The problem of system recovery from a large number of faults is addressed. Correlated transient upsets can corrupt the state of a large number of nodes (subsystems). In such a condition, locating the faulty nodes can be difficult because of the large number of periodic tests that may have to be carried out. A new approach to system level diagnostics is proposed that combines fault detection and location and can detect the fault condition in the event of a large number of faults. Detection allows alternative diagnosis techniques to be applied or, at the very least, a safe shutdown. This approach is termed safe diagnosis, as it provides a measure of safety for critical systems. It is demonstrated that safe diagnosis can be achieved at a small incremental cost. Results that characterize systems admitting a specified level of safe diagnosis are included, and diagnosis algorithms for such systems are presented. It is shown that the complexity of safe diagnosis algorithms is comparable to that of diagnosis algorithms for systems performing only fault location.

Original language: English (US)
Title of host publication: 91 Fault-Tolerant Comput. Symp.
Publisher: IEEE
Pages: 488-495
Number of pages: 8
ISBN (Print): 0818621508
State: Published - Jun 1 1991
Event: 21st International Symposium on Fault-Tolerant Computing - Montreal, Que, Can
Duration: Jun 25 1991 → Jun 27 1991

Publication series

Name: Digest of Papers - FTCS (Fault-Tolerant Computing Symposium)
ISSN (Print): 0731-3071

Other

Other: 21st International Symposium on Fault-Tolerant Computing
City: Montreal, Que, Can
Period: 6/25/91 → 6/27/91

ASJC Scopus subject areas

  • Hardware and Architecture
