TY - GEN
T1 - Locality and balance for communication-aware thread mapping in multicore systems
AU - Diener, Matthias
AU - Cruz, Eduardo H.M.
AU - Alves, Marco A.Z.
AU - Alhakeem, Mohammad S.
AU - Navaux, Philippe O.A.
AU - Heiß, Hans Ulrich
N1 - Publisher Copyright: © Springer-Verlag Berlin Heidelberg 2015.
PY - 2015
Y1 - 2015
N2 - In multicore architectures, deciding where to execute the threads of parallel applications is increasingly a significant challenge. This thread mapping has a large impact on the application’s performance and energy consumption. Recent research in this area mostly focuses on improving the locality of memory accesses and optimizing the use of shared caches by mapping threads that frequently communicate with each other to processing units that are closer to each other in the memory hierarchy. However, locality-based policies can lead to a substantial performance reduction in some cases due to communication imbalance. In this paper, we perform a comprehensive exploration of communication-aware thread mapping policies in multicore architectures. We develop a set of metrics to evaluate the communication behavior of parallel applications, and describe how these metrics can be used to favor locality-based or balance-based mapping policies. Based on these metrics, we introduce a novel mapping policy that combines locality and balance aspects and achieves the highest overall improvements. We provide an experimental evaluation of the performance gains using different mapping policies as well as a detailed analysis of the sources of energy savings.
AB - In multicore architectures, deciding where to execute the threads of parallel applications is increasingly a significant challenge. This thread mapping has a large impact on the application’s performance and energy consumption. Recent research in this area mostly focuses on improving the locality of memory accesses and optimizing the use of shared caches by mapping threads that frequently communicate with each other to processing units that are closer to each other in the memory hierarchy. However, locality-based policies can lead to a substantial performance reduction in some cases due to communication imbalance. In this paper, we perform a comprehensive exploration of communication-aware thread mapping policies in multicore architectures. We develop a set of metrics to evaluate the communication behavior of parallel applications, and describe how these metrics can be used to favor locality-based or balance-based mapping policies. Based on these metrics, we introduce a novel mapping policy that combines locality and balance aspects and achieves the highest overall improvements. We provide an experimental evaluation of the performance gains using different mapping policies as well as a detailed analysis of the sources of energy savings.
UR - http://www.scopus.com/inward/record.url?scp=84944097203&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84944097203&partnerID=8YFLogxK
U2 - 10.1007/978-3-662-48096-0_16
DO - 10.1007/978-3-662-48096-0_16
M3 - Conference contribution
AN - SCOPUS:84944097203
SN - 9783662480953
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 196
EP - 208
BT - Euro-Par 2015
A2 - Träff, Jesper Larsson
A2 - Hunold, Sascha
A2 - Versaci, Francesco
PB - Springer
T2 - 21st International Conference on Parallel and Distributed Computing, Euro-Par 2015
Y2 - 24 August 2015 through 28 August 2015
ER -