TY - GEN
T1 - Asymmetric memory fences
T2 - 20th International Conference on Architectural Support for Programming Languages and Operating Systems, ASPLOS 2015
AU - Duan, Yuelu
AU - Honarmand, Nima
AU - Torrellas, Josep
N1 - Publisher Copyright: Copyright © 2015 ACM.
PY - 2015/3/14
Y1 - 2015/3/14
N2 - There have been several recent efforts to improve the performance of fences. The most aggressive designs allow post-fence accesses to retire and complete before the fence completes. Unfortunately, such designs present implementation difficulties due to their reliance on global state and structures. This paper's goal is to optimize both the performance and the implementability of fences. We start off with a design like the most aggressive ones but without the global state. We call it Weak Fence or wF. Since the concurrent execution of multiple wFs can deadlock, we combine wFs with a conventional fence (i.e., Strong Fence or sF) for the less performance-critical thread(s). We call the result an Asymmetric fence group. We also propose a taxonomy of Asymmetric fence groups under TSO. Compared to past aggressive fences, Asymmetric fence groups are both substantially easier to implement and have higher average performance. The two main designs presented (WS+ and W+) speed up workloads under TSO by an average of 13% and 21%, respectively, over conventional fences.
AB - There have been several recent efforts to improve the performance of fences. The most aggressive designs allow post-fence accesses to retire and complete before the fence completes. Unfortunately, such designs present implementation difficulties due to their reliance on global state and structures. This paper's goal is to optimize both the performance and the implementability of fences. We start off with a design like the most aggressive ones but without the global state. We call it Weak Fence or wF. Since the concurrent execution of multiple wFs can deadlock, we combine wFs with a conventional fence (i.e., Strong Fence or sF) for the less performance-critical thread(s). We call the result an Asymmetric fence group. We also propose a taxonomy of Asymmetric fence groups under TSO. Compared to past aggressive fences, Asymmetric fence groups are both substantially easier to implement and have higher average performance. The two main designs presented (WS+ and W+) speed up workloads under TSO by an average of 13% and 21%, respectively, over conventional fences.
KW - Fences
KW - Parallel programming
KW - Sequential consistency
KW - Shared-memory machines
KW - Synchronization
UR - http://www.scopus.com/inward/record.url?scp=84939189548&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84939189548&partnerID=8YFLogxK
U2 - 10.1145/2694344.2694388
DO - 10.1145/2694344.2694388
M3 - Conference contribution
AN - SCOPUS:84939189548
T3 - International Conference on Architectural Support for Programming Languages and Operating Systems - ASPLOS
SP - 531
EP - 543
BT - ASPLOS 2015 - 20th International Conference on Architectural Support for Programming Languages and Operating Systems
PB - Association for Computing Machinery
Y2 - 14 March 2015 through 18 March 2015
ER -