Measurement-based analysis of system dependability using fault injection and field failure data

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

This paper discusses the issues involved in analyzing the availability of networked systems using fault injection and the failure data collected by the logging mechanisms built into the system. In particular, we address (1) analysis in the prototype phase using physical fault injection into an actual system and (2) measurement-based analysis of systems in the field. For the first, we use the example of a fault injection-based evaluation of a software-implemented fault tolerance (SIFT) environment, built around a set of self-checking processes called ARMORS, that provides error detection and recovery services to spaceborne scientific applications. For the second, we use the example of a LAN of Windows NT-based computers to present methods for collecting and analyzing failure data to characterize network system dependability. Both fault injection and failure data analysis enable us to study naturally occurring errors and to provide feedback to system designers on potential availability bottlenecks. For example, the study of failures in a network of Windows NT machines reveals that most of the problems that lead to reboots are software related and that, although the average availability evaluates to over 99%, a typical machine, on average, provides acceptable service only about 92% of the time.

Original language: English (US)
Title of host publication: Performance Evaluation of Complex Systems
Subtitle of host publication: Techniques and Tools - Performance 2002 Tutorial Lectures
Editors: Maria Carla Calzarossa, Salvatore Tucci
Publisher: Springer-Verlag Berlin Heidelberg
Pages: 290-317
Number of pages: 28
ISBN (Print): 9783540442523
DOIs: https://doi.org/10.1007/3-540-45798-4_13
State: Published - 2002
Event: IFIP WG 7.3 International Symposium on Computer Modeling, Measurement, and Evaluation, Performance 2002 - Rome, Italy
Duration: Sep 23 2002 → Sep 27 2002

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 2459
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: IFIP WG 7.3 International Symposium on Computer Modeling, Measurement, and Evaluation, Performance 2002
Country: Italy
City: Rome
Period: 9/23/02 → 9/27/02

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Computer Science (all)

  • Cite this

    Iyer, R. K., & Kalbarczyk, Z. (2002). Measurement-based analysis of system dependability using fault injection and field failure data. In M. C. Calzarossa, & S. Tucci (Eds.), Performance Evaluation of Complex Systems: Techniques and Tools - Performance 2002 Tutorial Lectures (pp. 290-317). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 2459). Springer-Verlag Berlin Heidelberg. https://doi.org/10.1007/3-540-45798-4_13