Characterizing the effects of transient faults on a high-performance processor pipeline

Nicholas J. Wang, Justin Quek, Todd M. Rafacz, Sanjay Jeram Patel

Research output: Contribution to conference (Paper)

Abstract

The progression of implementation technologies into sub-100 nanometer lithographies renews the importance of understanding and protecting against single-event upsets in digital systems. In this work, the effects of transient faults on high-performance microprocessors are explored. To perform a thorough exploration, a highly detailed register-transfer-level model of a deeply pipelined, out-of-order microprocessor was created. Using fault injection, we determined that fewer than 15% of single-bit corruptions in processor state result in software-visible errors. These failures were analyzed to identify the most vulnerable portions of the processor, which were then protected using simple low-overhead techniques. This resulted in a 75% reduction in failures. Building upon the failure modes seen in the microarchitecture, fault injections into software were performed to investigate the level of masking that the software layer provides. Together, the baseline microarchitectural substrate and software mask more than 9 out of 10 transient faults from affecting correct program execution.
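As a rough illustration of what a single-bit fault-injection campaign measures, the following minimal sketch flips every bit of a toy three-register state, one at a time, and counts how many flips change the program output. Everything in it is an assumption made for illustration: the register names, the `run_program` workload, and the exhaustive sweep are hypothetical and are not the authors' register-transfer-level model or their injection methodology.

```python
WIDTH = 32  # assumed register width for this toy model


def run_program(state):
    """Toy workload: only the low byte of r0 and all of r1 reach the output.

    r2 is never read, so any fault in it is masked by construction.
    """
    return ((state["r0"] & 0xFF) + state["r1"]) & 0xFFFFFFFF


def flip_bit(state, reg, bit):
    """Return a copy of the state with one bit of one register inverted."""
    faulty = dict(state)
    faulty[reg] ^= 1 << bit
    return faulty


def exhaustive_campaign(state):
    """Inject every possible single-bit fault and compare against a golden run.

    Returns (visible, total): the number of flips that changed the output,
    and the number of flips injected.
    """
    golden = run_program(state)
    visible = 0
    total = 0
    for reg in sorted(state):
        for bit in range(WIDTH):
            total += 1
            if run_program(flip_bit(state, reg, bit)) != golden:
                visible += 1
    return visible, total


if __name__ == "__main__":
    initial = {"r0": 0x12345678, "r1": 0x1000, "r2": 0xDEADBEEF}
    visible, total = exhaustive_campaign(initial)
    print(f"{visible} of {total} single-bit flips were visible in the output")
```

In this toy model, flips in the unused register and in the upper 24 bits of `r0` are masked, so 40 of the 96 possible flips are visible. That fraction is a property of the toy workload only and says nothing about the rates reported in the abstract; the point is the golden-run comparison, which is the usual way such a campaign separates masked faults from visible ones.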

Original language: English (US)
Pages: 61-70
Number of pages: 10
State: Published - Oct 1 2004
Event: 2004 International Conference on Dependable Systems and Networks - Florence, Italy
Duration: Jun 28 2004 to Jul 1 2004

Fingerprint

Microprocessor chips
Pipelines
Failure modes
Masks
Substrates

ASJC Scopus subject areas

  • Software
  • Hardware and Architecture
  • Computer Networks and Communications

Cite this

Wang, N. J., Quek, J., Rafacz, T. M., & Patel, S. J. (2004). Characterizing the effects of transient faults on a high-performance processor pipeline. 61-70. Paper presented at 2004 International Conference on Dependable Systems and Networks, Florence, Italy.

@conference{3af49193044b4c3a92437a93113d260c,
title = "Characterizing the effects of transient faults on a high-performance processor pipeline",
author = "Wang, {Nicholas J.} and Justin Quek and Rafacz, {Todd M.} and Patel, {Sanjay Jeram}",
year = "2004",
month = "10",
day = "1",
language = "English (US)",
pages = "61--70",
note = "2004 International Conference on Dependable Systems and Networks ; Conference date: 28-06-2004 Through 01-07-2004",

}

TY - CONF

T1 - Characterizing the effects of transient faults on a high-performance processor pipeline

AU - Wang, Nicholas J.

AU - Quek, Justin

AU - Rafacz, Todd M.

AU - Patel, Sanjay Jeram

PY - 2004/10/1

Y1 - 2004/10/1

UR - http://www.scopus.com/inward/record.url?scp=4544282186&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=4544282186&partnerID=8YFLogxK

M3 - Paper

AN - SCOPUS:4544282186

SP - 61

EP - 70

ER -