Abstract
This paper presents a systematic methodology to investigate the dependability of operational software. The methodology combines several techniques. Time series analysis is used to characterize the occurrence of software failures. Markov reward modeling is used to determine the loss in service due to failures of software components, and to identify major bottlenecks. The effectiveness of built-in fault tolerance is also evaluated. The methodology is illustrated using the software halt data from the Tandem GUARDIAN operating system. The results show that the occurrences of software halts are not correlated with each other in time. Interrupt handling and memory management are found to be the major bottlenecks in the measured system. The fault tolerance in the measured system was shown to reduce the service loss by nearly 90%.
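The Markov reward step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' model: the states, transition rates, and reward values below are hypothetical placeholders chosen only to show the mechanics. Each state of a continuous-time Markov chain is given a reward (the fraction of service delivered in that state), and the steady-state service loss is one minus the expected reward.

```python
def steady_state(Q):
    """Solve pi * Q = 0 with sum(pi) = 1 for a small CTMC generator matrix Q."""
    n = len(Q)
    # Transpose Q so that each row is one balance equation in the unknowns pi.
    A = [[float(Q[j][i]) for j in range(n)] for i in range(n)]
    # Replace the last (redundant) balance equation with the normalization.
    A[-1] = [1.0] * n
    b = [0.0] * (n - 1) + [1.0]

    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, n):
            f = A[r][col] / A[col][col]
            for c in range(col, n):
                A[r][c] -= f * A[col][c]
            b[r] -= f * b[col]

    # Back substitution.
    pi = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(A[i][j] * pi[j] for j in range(i + 1, n))
        pi[i] = (b[i] - s) / A[i][i]
    return pi


def expected_reward(Q, rewards):
    """Steady-state expected reward: sum over states of pi[i] * rewards[i]."""
    return sum(p * r for p, r in zip(steady_state(Q), rewards))


def service_loss(Q, rewards):
    """Fraction of service lost in steady state, with rewards scaled to [0, 1]."""
    return 1.0 - expected_reward(Q, rewards)


if __name__ == "__main__":
    # Hypothetical 3-state model (rates per hour are illustrative assumptions):
    #   0 = normal operation, 1 = recovering from a tolerated halt, 2 = down.
    Q = [
        [-0.010, 0.009, 0.001],
        [6.0, -6.0, 0.0],
        [0.5, 0.0, -0.5],
    ]
    rewards = [1.0, 0.5, 0.0]  # assumed fraction of service delivered per state
    print("steady-state service loss:", service_loss(Q, rewards))
```

Under these assumptions, the effect of fault tolerance could be estimated by evaluating `service_loss` for two generator matrices, one with and one without the recovery state, and comparing the results.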
Original language | English (US) |
---|---|
Article number | 285887 |
Pages (from-to) | 227-236 |
Number of pages | 10 |
Journal | Proceedings - International Symposium on Software Reliability Engineering, ISSRE |
State | Published - 1992 |
Event | 3rd International Symposium on Software Reliability Engineering, ISSRE 1992 - Research Triangle Park, United States; Duration: Oct 7 1992 → Oct 10 1992 |
ASJC Scopus subject areas
- Software
- Safety, Risk, Reliability and Quality