ReEnact: Using thread-level speculation mechanisms to debug data races in multithreaded codes

Milos Prvulovic, Josep Torrellas

Research output: Contribution to journal › Conference article

Abstract

While removing software bugs consumes vast amounts of human time, hardware support for debugging in modern computers remains rudimentary. Fortunately, we show that mechanisms for Thread-Level Speculation (TLS) can be reused to boost debugging productivity. Most notably, TLS's rollback capabilities can be extended to support rolling back recent buggy execution and repeating it as many times as necessary until the bug is fully characterized. These incremental re-executions are deterministic even in multithreaded codes. Importantly, this operation can be done automatically on the fly, and is compatible with production runs. As a specific implementation of a TLS-based debugging framework, we introduce ReEnact. ReEnact targets a particularly hairy class of bugs: data races in multithreaded programs. ReEnact extends the communication monitoring mechanisms in TLS to also detect data races. It extends TLS's rollback capabilities to be able to roll back and deterministically re-execute the code with races to obtain the race signature. Finally, the signature is compared to a library of race patterns and, if a match occurs, the execution may be repaired. Overall, ReEnact successfully detects, characterizes, and often repairs races automatically on the fly. Moreover, it is fully compatible with always-on use in production runs: the slowdown of race-free execution with ReEnact is on average only 5.8%.
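
To make the abstract's notions of a data race, a race signature, and deterministic re-execution concrete, the following is a minimal software sketch. It is not ReEnact's hardware mechanism: the fixed `schedule`, the `run` and `find_races` helpers, and the trace format are all hypothetical and chosen only for illustration. Two threads each increment a shared counter without synchronization, and the toy model assumes no locks exist, so any conflicting pair of accesses from different threads is reported as a race.

```python
from typing import Dict, List, Tuple

# One recorded memory access: (thread id, "R" or "W", variable name).
Access = Tuple[int, str, str]


def run(schedule: List[int]) -> Tuple[int, List[Access]]:
    """Run two unsynchronized increments of `counter` under a fixed interleaving.

    `schedule` lists which thread executes the next step. Each thread has two
    steps: read the counter into a private register, then write register + 1.
    """
    counter = 0
    register: Dict[int, int] = {0: 0, 1: 0}  # per-thread private copy
    step: Dict[int, int] = {0: 0, 1: 0}      # 0 = read next, 1 = write next
    trace: List[Access] = []
    for tid in schedule:
        if step[tid] == 0:
            register[tid] = counter
            trace.append((tid, "R", "counter"))
        elif step[tid] == 1:
            counter = register[tid] + 1
            trace.append((tid, "W", "counter"))
        step[tid] += 1
    return counter, trace


def find_races(trace: List[Access]) -> List[Tuple[Access, Access]]:
    """Return conflicting access pairs: different threads, same variable,
    at least one write. Assumes the program contains no synchronization."""
    races = []
    for i, first in enumerate(trace):
        for second in trace[i + 1:]:
            different_threads = first[0] != second[0]
            same_variable = first[2] == second[2]
            has_write = "W" in (first[1], second[1])
            if different_threads and same_variable and has_write:
                races.append((first, second))
    return races


racy_schedule = [0, 1, 0, 1]    # both threads read before either writes
serial_schedule = [0, 0, 1, 1]  # thread 0 finishes before thread 1 starts

racy_result, racy_trace = run(racy_schedule)
replayed_result, replayed_trace = run(racy_schedule)
serial_result, _ = run(serial_schedule)
racy_pairs = find_races(racy_trace)
```

Replaying the same schedule yields the same trace, which plays the role of a reproducible signature in this toy model. The interleaved schedule loses one update, while the serialized schedule does not.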

Original language: English (US)
Pages (from-to): 110-121
Number of pages: 12
Journal: Conference Proceedings - Annual International Symposium on Computer Architecture, ISCA
State: Published - Jul 17 2003
Event: 30th Annual International Symposium on Computer Architecture - San Diego, CA, United States
Duration: Jun 9 2003 to Jun 11 2003

Fingerprint

Computer debugging
Computer hardware
Repair
Productivity
Monitoring
Communication

ASJC Scopus subject areas

  • Hardware and Architecture

Cite this

@article{fc10c47b75da4245a839088218f75aac,
title = "ReEnact: Using thread-level speculation mechanisms to debug data races in multithreaded codes",
author = "Milos Prvulovic and Josep Torrellas",
year = "2003",
month = "7",
day = "17",
language = "English (US)",
pages = "110--121",
journal = "Conference Proceedings - Annual International Symposium on Computer Architecture, ISCA",
issn = "1063-6897",
publisher = "Institute of Electrical and Electronics Engineers Inc.",

}

TY - JOUR

T1 - ReEnact

T2 - Using thread-level speculation mechanisms to debug data races in multithreaded codes

AU - Prvulovic, Milos

AU - Torrellas, Josep

PY - 2003/7/17

Y1 - 2003/7/17

UR - http://www.scopus.com/inward/record.url?scp=0038346243&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=0038346243&partnerID=8YFLogxK

M3 - Conference article

AN - SCOPUS:0038346243

SP - 110

EP - 121

JO - Conference Proceedings - Annual International Symposium on Computer Architecture, ISCA

JF - Conference Proceedings - Annual International Symposium on Computer Architecture, ISCA

SN - 1063-6897

ER -