TY - GEN
T1 - BulkSC
T2 - ISCA'07: 34th Annual International Symposium on Computer Architecture
AU - Ceze, Luis
AU - Tuck, James
AU - Montesinos, Pablo
AU - Torrellas, Josep
PY - 2007
Y1 - 2007
N2 - While Sequential Consistency (SC) is the most intuitive memory consistency model and the one most programmers likely assume, current multiprocessors do not support it. Instead, they support more relaxed models that deliver high performance. SC implementations are considered either too slow or - when they can match the performance of relaxed models - too difficult to implement. In this paper, we propose Bulk Enforcement of SC (BulkSC), a novel way of providing SC that is simple to implement and offers performance comparable to Release Consistency (RC). The idea is to dynamically group sets of consecutive instructions into chunks that appear to execute atomically and in isolation. The hardware enforces SC at the coarse grain of chunks which, to the program, appears as providing SC at the individual memory access level. BulkSC keeps the implementation simple by largely decoupling memory consistency enforcement from processor structures. Moreover, it delivers high performance by enabling full memory access reordering and overlapping within chunks and across chunks. We describe a complete system architecture that supports BulkSC and show that it delivers performance comparable to RC.
AB - While Sequential Consistency (SC) is the most intuitive memory consistency model and the one most programmers likely assume, current multiprocessors do not support it. Instead, they support more relaxed models that deliver high performance. SC implementations are considered either too slow or - when they can match the performance of relaxed models - too difficult to implement. In this paper, we propose Bulk Enforcement of SC (BulkSC), a novel way of providing SC that is simple to implement and offers performance comparable to Release Consistency (RC). The idea is to dynamically group sets of consecutive instructions into chunks that appear to execute atomically and in isolation. The hardware enforces SC at the coarse grain of chunks which, to the program, appears as providing SC at the individual memory access level. BulkSC keeps the implementation simple by largely decoupling memory consistency enforcement from processor structures. Moreover, it delivers high performance by enabling full memory access reordering and overlapping within chunks and across chunks. We describe a complete system architecture that supports BulkSC and show that it delivers performance comparable to RC.
KW - Bulk
KW - Chip multiprocessors
KW - Memory consistency models
KW - Programmability
KW - Sequential consistency
UR - http://www.scopus.com/inward/record.url?scp=35348862407&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=35348862407&partnerID=8YFLogxK
U2 - 10.1145/1250662.1250697
DO - 10.1145/1250662.1250697
M3 - Conference contribution
AN - SCOPUS:35348862407
SN - 1595937064
SN - 9781595937063
T3 - Proceedings - International Symposium on Computer Architecture
SP - 278
EP - 289
BT - ISCA'07
Y2 - 9 June 2007 through 13 June 2007
ER -