TY - GEN
T1 - Cross-Layer Resilience: Challenges, Insights, and the Road Ahead
AU - Cheng, Eric
AU - Mueller-Gritschneder, Daniel
AU - Abraham, Jacob
AU - Bose, Pradip
AU - Buyuktosunoglu, Alper
AU - Chen, Deming
AU - Cho, Hyungmin
AU - Li, Yanjing
AU - Sharif, Uzair
AU - Skadron, Kevin
AU - Stan, Mircea
AU - Schlichtmann, Ulf
AU - Mitra, Subhasish
N1 - Funding Information:
This research was supported in part by Defense Advanced Research Projects Agency (IBM, Stanford, U. Virginia). The views, opinions and/or findings expressed are those of the authors and should not be interpreted as representing the official views or policies of the Department of Defense or the U.S. Government. Other sponsors include DFG SPP1500 (TU Munich), DTRA (Stanford), NSF (Stanford), and SRC (Stanford, UIUC).
Publisher Copyright:
© 2018 Copyright held by the owner/author(s).
PY - 2019/6/2
Y1 - 2019/6/2
N2 - Resilience to errors in the underlying hardware is a key design objective for a large class of computing systems, from embedded systems all the way to the cloud. Sources of hardware errors include radiation, circuit aging, variability induced by manufacturing and operating conditions, manufacturing test escapes, and early-life failures. Many publications have suggested that cross-layer resilience, where multiple error resilience techniques from different layers of the system stack cooperate to achieve cost-effective resilience, is essential for designing cost-effective resilient digital systems. This paper presents a comprehensive overview of cross-layer resilience by addressing fundamental cross-layer resilience questions, by summarizing insights derived from recent advances in cross-layer resilience research, and by discussing future cross-layer resilience challenges.
AB - Resilience to errors in the underlying hardware is a key design objective for a large class of computing systems, from embedded systems all the way to the cloud. Sources of hardware errors include radiation, circuit aging, variability induced by manufacturing and operating conditions, manufacturing test escapes, and early-life failures. Many publications have suggested that cross-layer resilience, where multiple error resilience techniques from different layers of the system stack cooperate to achieve cost-effective resilience, is essential for designing cost-effective resilient digital systems. This paper presents a comprehensive overview of cross-layer resilience by addressing fundamental cross-layer resilience questions, by summarizing insights derived from recent advances in cross-layer resilience research, and by discussing future cross-layer resilience challenges.
KW - Cross-layer resilience
KW - Fault tolerance
KW - Reliability
UR - http://www.scopus.com/inward/record.url?scp=85067795200&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85067795200&partnerID=8YFLogxK
U2 - 10.1145/3316781.3323474
DO - 10.1145/3316781.3323474
M3 - Conference contribution
AN - SCOPUS:85067795200
T3 - Proceedings - Design Automation Conference
BT - Proceedings of the 56th Annual Design Automation Conference 2019, DAC 2019
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 56th Annual Design Automation Conference, DAC 2019
Y2 - 2 June 2019 through 6 June 2019
ER -