Abstract
Chip Multiprocessors (CMPs) are flexible, high-frequency platforms on which to support Thread-Level Speculation (TLS). However, for TLS to deliver on its promise, CMPs must exploit multiple sources of speculative task-level parallelism, including any nesting levels of both subroutines and loop iterations. Unfortunately, these environments are hard to support in decentralized CMP hardware: since tasks are spawned out-of-order and unpredictably, maintaining key TLS basics such as task ordering and efficient resource allocation is challenging. While the concept of out-of-order spawning is not new, this paper is the first to propose a set of microarchitectural mechanisms that, altogether, fundamentally enable fast TLS with out-of-order spawn in a CMP. Moreover, we develop a fully-automated TLS compiler for aggressive out-of-order spawn. With our mechanisms, a TLS CMP with four 4-issue cores achieves an average speedup of 1.30 for full SPECint 2000 applications; the corresponding speedup for in-order-only spawn is 1.04. Overall, our mechanisms unlock the potential of TLS for the toughest applications.
Original language | English (US) |
---|---|
Pages | 179-188 |
Number of pages | 10 |
State | Published - 2005 |
Event | ICS05 - 19th ACM International Conference on Supercomputing - Cambridge, MA, United States; Duration: Jun 20 2005 → Jun 22 2005 |
ASJC Scopus subject areas
- General Computer Science