TY - GEN
T1 - DeNovoSync
T2 - 20th International Conference on Architectural Support for Programming Languages and Operating Systems, ASPLOS 2015
AU - Sung, Hyojin
AU - Adve, Sarita V.
N1 - Publisher Copyright:
Copyright © 2015 ACM.
PY - 2015/3/14
Y1 - 2015/3/14
N2 - Current shared-memory hardware is complex and inefficient. Prior work on the DeNovo coherence protocol showed that disciplined shared-memory programming models can enable more complexity-, performance-, and energy-efficient hardware than the state-of-the-art MESI protocol. DeNovo, however, severely restricted the synchronization constructs an application can support. This paper proposes DeNovoSync, a technique to support arbitrary synchronization in DeNovo. The key challenge is that DeNovo exploits race-freedom to use reader-initiated local self-invalidations (instead of conventional writer-initiated remote cache invalidations) to ensure coherence. Synchronization accesses are inherently racy and not directly amenable to self-invalidations. DeNovoSync addresses this challenge using a novel combination of registration of all synchronization reads with a judicious hardware backoff to limit unnecessary registrations. For a wide variety of synchronization constructs and applications, compared to MESI, DeNovoSync shows comparable or up to 22% lower execution time and up to 58% lower network traffic, enabling DeNovo's advantages for a much broader class of software than previously possible.
AB - Current shared-memory hardware is complex and inefficient. Prior work on the DeNovo coherence protocol showed that disciplined shared-memory programming models can enable more complexity-, performance-, and energy-efficient hardware than the state-of-the-art MESI protocol. DeNovo, however, severely restricted the synchronization constructs an application can support. This paper proposes DeNovoSync, a technique to support arbitrary synchronization in DeNovo. The key challenge is that DeNovo exploits race-freedom to use reader-initiated local self-invalidations (instead of conventional writer-initiated remote cache invalidations) to ensure coherence. Synchronization accesses are inherently racy and not directly amenable to self-invalidations. DeNovoSync addresses this challenge using a novel combination of registration of all synchronization reads with a judicious hardware backoff to limit unnecessary registrations. For a wide variety of synchronization constructs and applications, compared to MESI, DeNovoSync shows comparable or up to 22% lower execution time and up to 58% lower network traffic, enabling DeNovo's advantages for a much broader class of software than previously possible.
UR - http://www.scopus.com/inward/record.url?scp=84939150628&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84939150628&partnerID=8YFLogxK
U2 - 10.1145/2694344.2694356
DO - 10.1145/2694344.2694356
M3 - Conference contribution
AN - SCOPUS:84939150628
T3 - International Conference on Architectural Support for Programming Languages and Operating Systems - ASPLOS
SP - 545
EP - 559
BT - ASPLOS 2015 - 20th International Conference on Architectural Support for Programming Languages and Operating Systems
PB - Association for Computing Machinery
Y2 - 14 March 2015 through 18 March 2015
ER -