IPAS: Intelligent protection against silent output corruption in scientific applications

Ignacio Laguna, Martin Schulz, David F. Richards, Jon Calhoun, Luke Olson

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

This paper presents IPAS, an instruction duplication technique that protects scientific applications from silent data corruption (SDC) in their output. The motivation for IPAS is that, due to natural error masking, only a subset of SDC errors actually affects the output of scientific codes; we call these errors silent output corruption (SOC) errors. Thus, applications require duplication only on code that, when affected by a fault, yields SOC. We use machine learning to learn which code instructions must be protected to avoid SOC, and, using a compiler, we protect only those vulnerable instructions by duplication, thus significantly reducing the overhead that is introduced by instruction duplication. In our experiments with five workloads, IPAS reduces the percentage of SOC by up to 90% with a slowdown that ranges between 1.04× and 1.35×, which corresponds to as much as 47% less slowdown than state-of-the-art instruction duplication techniques.
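The idea of duplicating only selected computations can be illustrated with a small sketch. This is not the authors' implementation: IPAS works on instructions inside a compiler and uses a trained classifier to pick them, whereas the sketch below works on Python callables, and the `protect` flag simply stands in for the classifier's decision. All names (`run_protected`, `make_flaky`, `MismatchDetected`) are hypothetical and chosen for illustration only.

```python
import operator


class MismatchDetected(Exception):
    """Raised when the two copies of a duplicated computation disagree."""


def run_protected(op, *args, protect):
    """Run op(*args); if protect is True, run it a second time and compare.

    `protect` is an assumption standing in for a per-instruction decision
    (in IPAS this decision comes from a learned model, not a manual flag).
    """
    result = op(*args)
    if protect:
        shadow = op(*args)  # duplicated computation
        if shadow != result:
            raise MismatchDetected(f"{result!r} != {shadow!r}")
    return result


def make_flaky(op, fault_on_call):
    """Wrap an integer-valued op so that one chosen call returns a corrupted value.

    This simulates a transient fault by flipping the lowest bit of the result.
    """
    calls = {"n": 0}

    def wrapped(*args):
        calls["n"] += 1
        value = op(*args)
        if calls["n"] == fault_on_call:
            return value ^ 1  # single bit flip in the result
        return value

    return wrapped


if __name__ == "__main__":
    # Unprotected: the corrupted value is returned without any error.
    silent = run_protected(make_flaky(operator.add, 1), 2, 3, protect=False)
    print("unprotected result:", silent)

    # Protected: the second copy disagrees with the first, so the fault is detected.
    try:
        run_protected(make_flaky(operator.add, 1), 2, 3, protect=True)
    except MismatchDetected as exc:
        print("fault detected:", exc)
```

In this sketch the protected call runs twice, so its cost roughly doubles. That is why restricting duplication to the computations judged vulnerable, as the abstract describes, lowers the overall slowdown.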

Original language: English (US)
Title of host publication: Proceedings of the 14th International Symposium on Code Generation and Optimization, CGO 2016
Publisher: Association for Computing Machinery
Pages: 227-238
Number of pages: 12
ISBN (Electronic): 9781450337786
State: Published - Feb 29 2016
Event: 14th Annual IEEE/ACM International Symposium on Code Generation and Optimization, CGO 2016 - Barcelona, Spain
Duration: Mar 12 2016 to Mar 18 2016

Publication series

Name: Proceedings of the 14th International Symposium on Code Generation and Optimization, CGO 2016

Other

Other: 14th Annual IEEE/ACM International Symposium on Code Generation and Optimization, CGO 2016
Country/Territory: Spain
City: Barcelona
Period: 3/12/16 to 3/18/16

Keywords

  • Compiler analysis
  • High-performance computing
  • Machine learning
  • Resilience

ASJC Scopus subject areas

  • Software
  • Applied Mathematics
  • Computational Theory and Mathematics
