TY - GEN
T1 - Bulk disambiguation of speculative threads in multiprocessors
AU - Ceze, Luis
AU - Tuck, James
AU - Caşcaval, Călin
AU - Torrellas, Josep
PY - 2006
Y1 - 2006
AB - Transactional Memory (TM), Thread-Level Speculation (TLS), and Checkpointed multiprocessors are three popular architectural techniques based on the execution of multiple, cooperating speculative threads. In these environments, correctly maintaining data dependences across threads requires mechanisms for disambiguating addresses across threads, invalidating stale cache state, and making committed state visible. These mechanisms are both conceptually involved and hard to implement. In this paper, we present Bulk, a novel approach to simplify these mechanisms. The idea is to hash-encode a thread's access information in a concise signature, and then support in hardware signature operations that efficiently process sets of addresses. Such operations implement the mechanisms described. Bulk operations are inexact but correct, and provide substantial conceptual and implementation simplicity. We evaluate Bulk in the context of TLS using SPECint2000 codes and TM using multithreaded Java workloads. Despite its simplicity, Bulk has competitive performance with more complex schemes. We also find that signature configuration is a key design parameter.
UR - http://www.scopus.com/inward/record.url?scp=33845866604&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=33845866604&partnerID=8YFLogxK
U2 - 10.1109/ISCA.2006.13
DO - 10.1109/ISCA.2006.13
M3 - Conference contribution
AN - SCOPUS:33845866604
SN - 076952608X
SN - 9780769526089
T3 - Proceedings - International Symposium on Computer Architecture
SP - 227
EP - 238
BT - Proceedings - 33rd International Symposium on Computer Architecture, ISCA 2006
T2 - 33rd International Symposium on Computer Architecture, ISCA 2006
Y2 - 17 June 2006 through 21 June 2006
ER -