TY - GEN
T1 - ScalableBulk
T2 - 43rd Annual IEEE/ACM International Symposium on Microarchitecture, MICRO 2010
AU - Qian, Xuehai
AU - Ahn, Wonsun
AU - Torrellas, Josep
PY - 2010
Y1 - 2010
N2 - Recently-proposed architectures that continuously operate on atomic blocks of instructions (also called chunks) can boost the programmability and performance of shared-memory multiprocessing. However, they must support chunk operations very efficiently. In particular, in lazy conflict-detection environments, it is key that they provide scalable chunk commits. Unfortunately, current proposals typically fail to enable maximum overlap of conflict-free chunk commits. This paper presents a novel directory-based protocol that enables highly-overlapped, scalable chunk commits. The protocol, called ScalableBulk, builds on the previously-proposed BulkSC protocol. It introduces three general hardware primitives for scalable commit: preventing access to a set of directory entries, grouping directory modules, and initiating the commit optimistically. Our results with SPLASH-2 and PARSEC codes with up to 64 processors show that ScalableBulk enables highly-overlapped chunk commits and delivers scalable performance. Unlike previously-proposed schemes, it removes practically all commit stalls.
AB - Recently-proposed architectures that continuously operate on atomic blocks of instructions (also called chunks) can boost the programmability and performance of shared-memory multiprocessing. However, they must support chunk operations very efficiently. In particular, in lazy conflict-detection environments, it is key that they provide scalable chunk commits. Unfortunately, current proposals typically fail to enable maximum overlap of conflict-free chunk commits. This paper presents a novel directory-based protocol that enables highly-overlapped, scalable chunk commits. The protocol, called ScalableBulk, builds on the previously-proposed BulkSC protocol. It introduces three general hardware primitives for scalable commit: preventing access to a set of directory entries, grouping directory modules, and initiating the commit optimistically. Our results with SPLASH-2 and PARSEC codes with up to 64 processors show that ScalableBulk enables highly-overlapped chunk commits and delivers scalable performance. Unlike previously-proposed schemes, it removes practically all commit stalls.
UR - http://www.scopus.com/inward/record.url?scp=79951707317&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=79951707317&partnerID=8YFLogxK
U2 - 10.1109/MICRO.2010.29
DO - 10.1109/MICRO.2010.29
M3 - Conference contribution
AN - SCOPUS:79951707317
SN - 9780769542997
T3 - Proceedings of the Annual International Symposium on Microarchitecture, MICRO
SP - 447
EP - 458
BT - Proceedings - 43rd Annual IEEE/ACM International Symposium on Microarchitecture, MICRO 2010
Y2 - 4 December 2010 through 8 December 2010
ER -