Light64: Lightweight hardware support for data race detection during systematic testing of parallel programs

Research output: Contribution to journalConference articlepeer-review

Abstract

Developing and testing parallel code is hard. Even for one given input, a parallel program can have many possible different thread interleavings, which are hard for the programmer to foresee and for a testing tool to cover using stress or random testing. For this reason, a recent trend is to use Systematic Testing, which methodically explores different thread interleavings, while checking for various bugs. Data races are common bugs but, unfortunately, checking for races is often skipped in systematic testers because it introduces substantial runtime overhead if done purely in software. Recently, several techniques for race detection in hardware have been proposed, but they still require significant hardware support. This paper presents Light64, a novel technique for data race detection during systematic testing that has both small runtime overhead and very lightweight hardware requirements. Light64 is based on the observation that two thread interleavings in which racing accesses are flipped will very likely exhibit some deviation in their program execution history. Light64 computes a 64-bit hash of the program execution history during systematic testing. If the hashes of two interleavings with the same happens-before graph differ, then a race has occurred. Light64 only needs a 64-bit register per core, a drastic improvement over previous hardware schemes. In addition, our experiments on SPLASH-2 applications show that Light64 has no false positives, detects 96% of races, and induces only a small slowdown for race-free executions - on average, 1% and 37% in two different modes.

Original languageEnglish (US)
Pages (from-to)541-552
Number of pages12
JournalProceedings of the Annual International Symposium on Microarchitecture, MICRO
DOIs
StatePublished - 2009
Event42nd Annual IEEE/ACM International Symposium on Microarchitecture, Micro-42 - New York, NY, United States
Duration: Dec 12 2009Dec 16 2009

Keywords

  • Data race
  • Execution history hash
  • Systematic testing

ASJC Scopus subject areas

  • Hardware and Architecture

Fingerprint

Dive into the research topics of 'Light64: Lightweight hardware support for data race detection during systematic testing of parallel programs'. Together they form a unique fingerprint.

Cite this