TY - GEN
T1 - HeteroSync
T2 - 2017 IEEE International Symposium on Workload Characterization, IISWC 2017
AU - Sinclair, Matthew D.
AU - Alsop, Johnathan
AU - Adve, Sarita V.
PY - 2017/12/5
Y1 - 2017/12/5
N2 - Traditionally, GPUs focused on streaming, data-parallel applications with little data reuse or sharing and coarse-grained synchronization. However, the rise of general-purpose GPU (GPGPU) computing has made GPUs desirable for applications with more general sharing patterns and fine-grained synchronization, especially for recent GPUs that have a unified address space and coherent caches. Prior work has introduced microbenchmarks to measure the impact of these changes, but each paper uses its own set of microbenchmarks. In this work, we combine several of these sets into a single suite, HeteroSync. HeteroSync includes several synchronization primitives, data sharing at different levels of the memory hierarchy, and relaxed atomics. We characterize the scalability of HeteroSync for different coherence protocols and consistency models on modern, tightly coupled CPU-GPU systems and show that certain algorithms, coherence protocols, and consistency models scale better than others.
UR - http://www.scopus.com/inward/record.url?scp=85046538842&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85046538842&partnerID=8YFLogxK
U2 - 10.1109/IISWC.2017.8167781
DO - 10.1109/IISWC.2017.8167781
M3 - Conference contribution
AN - SCOPUS:85046538842
T3 - Proceedings of the 2017 IEEE International Symposium on Workload Characterization, IISWC 2017
SP - 239
EP - 249
BT - Proceedings of the 2017 IEEE International Symposium on Workload Characterization, IISWC 2017
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 1 October 2017 through 3 October 2017
ER -