Dependability analysis of a high-speed network using software-implemented fault injection and simulated fault injection

David T. Stott, Greg Ries, Mei Chen Hsueh, Ravishankar K. Iyer

Research output: Contribution to journal › Article › peer-review

Abstract

This paper presents a dependability study of high-speed, switched Local Area Networks (LANs) using Myrinet as an example testbed (with theoretical speeds of 2.56 Gbps). The study uses results of two fault injection methods, simulated fault injection and software-implemented fault injection (SWIFI), to analyze the application-level impact of transient faults injected into the network interface hardware. These results include a number of errors, such as dropped or corrupt messages, host interface or host resets, and local or remote host interface hangs. The paper presents the study in two parts: First, the results from the SWIFI method in the real system are used as a basis to validate the simulation and identify the major factors leading to differences between the methods. A comparison between the two injection methods shows that they agree for 83 percent of the fault injections. The results, however, vary greatly, depending on the fault type considered. The study also presents an analysis of the effects of varying workload intensity, host platform, and interface function targeted by the injection. An example of this analysis is to show that the function targeted has a significant impact on the fault activation rate. Finally, the study identifies two mechanisms by which faults may propagate from the interface to other parts of the network; in one example, this propagation caused the interface's host computer to reboot, while another caused a remote interface in the network to hang.

Original language: English (US)
Pages (from-to): 108-119
Number of pages: 12
Journal: IEEE Transactions on Computers
Volume: 47
Issue number: 1
State: Published - 1998

Keywords

  • Dependability
  • Embedded system
  • Fault effect
  • Fault simulation
  • Myrinet
  • SWIFI

ASJC Scopus subject areas

  • Software
  • Theoretical Computer Science
  • Hardware and Architecture
  • Computational Theory and Mathematics
