FINE: A Fault Injection and Monitoring Environment for Tracing the UNIX System Behavior under Faults

Wei lun Kao, Ravishankar K. Iyer

Research output: Contribution to journal › Article

Abstract

Fault injection has been used to evaluate the dependability of computer systems, but most fault-injection studies concentrate on the final impact of faults on the system with an emphasis on fault latency and coverage issues. What really happens after a fault is injected and how a fault propagates in a software system are not well understood. This paper presents a fault injection and monitoring environment (FINE) as a tool to study fault propagation in the UNIX kernel. FINE injects hardware-induced software errors and software faults into the UNIX kernel and traces the execution flow and key variables of the kernel. It consists of a fault injector, a software monitor, a workload generator, a controller, and several analysis utilities. Experiments on SunOS 4.1.2 are conducted by applying FINE to investigate fault propagation and to evaluate the impact of various types of faults. Fault propagation models are built for both hardware and software faults. Transient Markov reward analysis is performed based on the models to evaluate the loss of performance due to an injected fault. Experimental results show that memory faults and software faults usually have a very long latency while bus faults and CPU faults tend to crash the system immediately. About half of the detected errors are data faults, which are detected when the system tries to access an unauthorized memory location. Only about 8% of faults propagate to other UNIX subsystems. Markov reward analysis shows that the performance loss incurred by bus faults and CPU faults is much higher than that incurred by software and memory faults. Among software faults, the impact of pointer faults is higher than that of nonpointer faults.
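To make the idea of transient Markov reward analysis concrete, the sketch below computes the expected reward accumulated over a finite horizon and the resulting performance loss. It is only an illustration under stated assumptions: the three states, the transition probabilities, the reward rates, and the discrete-time formulation are all hypothetical and are not taken from the paper's fault propagation models.

```python
# Illustrative sketch of transient Markov reward analysis.
# ASSUMPTIONS: the states, transition matrix, reward rates, and the use of a
# discrete-time chain are hypothetical; none of these values come from the paper.

STATES = ["normal", "error", "crash"]

# P[i][j] is the assumed probability of moving from state i to state j in one step.
P = [
    [0.90, 0.10, 0.00],  # normal: usually stays normal, sometimes an error appears
    [0.20, 0.50, 0.30],  # error: may recover, persist, or crash the system
    [0.00, 0.00, 1.00],  # crash: absorbing state
]

# Assumed reward (useful work delivered per step) in each state.
REWARD = [1.0, 0.5, 0.0]


def step(dist, matrix):
    """Advance a state probability distribution by one step."""
    n = len(dist)
    return [sum(dist[i] * matrix[i][j] for i in range(n)) for j in range(n)]


def expected_accumulated_reward(initial, matrix, reward, steps):
    """Sum the expected reward over the first `steps` time steps."""
    dist = list(initial)
    total = 0.0
    for _ in range(steps):
        total += sum(p * r for p, r in zip(dist, reward))
        dist = step(dist, matrix)
    return total


def performance_loss(initial, matrix, reward, steps):
    """Loss relative to a system that earns the maximum reward at every step."""
    ideal = steps * max(reward)
    return ideal - expected_accumulated_reward(initial, matrix, reward, steps)


if __name__ == "__main__":
    horizon = 20
    for name, start in [("normal", [1.0, 0.0, 0.0]), ("error", [0.0, 1.0, 0.0])]:
        loss = performance_loss(start, P, REWARD, horizon)
        print(f"start in {name}: expected loss over {horizon} steps = {loss:.3f}")
```

In this toy model, a fault that places the system directly in the error state loses more reward over the same horizon than one that leaves it in the normal state, which mirrors the kind of comparison the abstract draws between fault types.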

Original language: English (US)
Pages (from-to): 1105-1118
Number of pages: 14
Journal: IEEE Transactions on Software Engineering
Volume: 19
Issue number: 11
State: Published - Nov 1993

Keywords

  • Fault/error injection
  • UNIX kernel
  • fault modeling
  • fault propagation modeling
  • fault/error propagation
  • monitor
  • software
  • transient reward analysis

ASJC Scopus subject areas

  • Software
