TY - JOUR
T1 - Need for fast communication in hardware-based speculative chip multiprocessors
AU - Krishnan, Venkata
AU - Torrellas, Josep
PY - 1999
Y1 - 1999
N2 - Chip-multiprocessor (CMP) architectures are a promising design alternative to exploit the ever-increasing number of transistors that can be put on a die. To deliver high performance on applications that cannot be easily parallelized, CMPs can use additional support for speculatively executing the possibly data-dependent threads of an application. While some of the cross-thread dependences in applications must be handled dynamically, others can be fully determined by the compiler. For the latter dependences, the threads can be made to synchronize and communicate either at the register level or at the memory level. In the past, it has been unclear whether the higher hardware cost of register-level communication is cost-effective. In this paper, we show that the wide-issue dynamic processors that will soon populate CMPs make fast communication a requirement for high performance. Consequently, we propose an effective hardware mechanism to support communication and synchronization of registers between on-chip processors. Our scheme adds enough support to enable register-level communication without specializing the architecture so much toward speculation that it leads to much unutilized hardware under workloads that do not need speculative parallelization. Finally, the scheme allows the system to achieve near-ideal performance.
AB - Chip-multiprocessor (CMP) architectures are a promising design alternative to exploit the ever-increasing number of transistors that can be put on a die. To deliver high performance on applications that cannot be easily parallelized, CMPs can use additional support for speculatively executing the possibly data-dependent threads of an application. While some of the cross-thread dependences in applications must be handled dynamically, others can be fully determined by the compiler. For the latter dependences, the threads can be made to synchronize and communicate either at the register level or at the memory level. In the past, it has been unclear whether the higher hardware cost of register-level communication is cost-effective. In this paper, we show that the wide-issue dynamic processors that will soon populate CMPs make fast communication a requirement for high performance. Consequently, we propose an effective hardware mechanism to support communication and synchronization of registers between on-chip processors. Our scheme adds enough support to enable register-level communication without specializing the architecture so much toward speculation that it leads to much unutilized hardware under workloads that do not need speculative parallelization. Finally, the scheme allows the system to achieve near-ideal performance.
UR - http://www.scopus.com/inward/record.url?scp=0033362289&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=0033362289&partnerID=8YFLogxK
M3 - Article
AN - SCOPUS:0033362289
SP - 24
EP - 33
JO - Parallel Architectures and Compilation Techniques - Conference Proceedings, PACT
JF - Parallel Architectures and Compilation Techniques - Conference Proceedings, PACT
SN - 1089-795X
ER -