GPGPUs: How to combine high computational power with high reliability

L. Bautista Gomez, F. Cappello, L. Carro, N. Debardeleben, B. Fang, S. Gurumurthi, K. Pattabiraman, P. Rech, M. Sonza Reorda

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

GPGPUs are used increasingly in several domains, from gaming to different kinds of computationally intensive applications. In many applications GPGPU reliability is becoming a serious issue, and several research activities are focusing on its evaluation. This paper offers an overview of some major results in the area. First, it shows and analyzes the results of some experiments assessing GPGPU reliability in HPC datacenters. Second, it provides some recent results derived from radiation experiments about the reliability of GPGPUs. Third, it describes the characteristics of an advanced fault-injection environment, allowing effective evaluation of the resiliency of applications running on GPGPUs.

Original language: English (US)
Title of host publication: Proceedings - Design, Automation and Test in Europe, DATE 2014
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Print): 9783981537024
State: Published - 2014
Externally published: Yes
Event: 17th Design, Automation and Test in Europe, DATE 2014 - Dresden, Germany
Duration: Mar 24 2014 to Mar 28 2014

Publication series

Name: Proceedings - Design, Automation and Test in Europe, DATE
ISSN (Print): 1530-1591

Other

Other: 17th Design, Automation and Test in Europe, DATE 2014
Country/Territory: Germany
City: Dresden
Period: 3/24/14 to 3/28/14

Keywords

  • fault injection
  • GPGPUs
  • HPC
  • radiation
  • reliability

ASJC Scopus subject areas

  • General Engineering
