TY - GEN
T1 - TABARNAC
T2 - 2nd Workshop on Visual Performance Analysis, VPA 2015
AU - Beniamine, David
AU - Diener, Matthias
AU - Huard, Guillaume
AU - Navaux, Philippe O.A.
N1 - Publisher Copyright:
Copyright 2015 ACM.
PY - 2015/11/15
Y1 - 2015/11/15
N2 - In modern parallel architectures, memory accesses represent a common bottleneck. Thus, optimizing the way applications access memory is an important way to improve performance and energy consumption. Memory accesses are even more important on NUMA machines, as the access time to data depends on its location in memory. Many efforts have been made to develop adaptive tools that improve memory accesses at runtime by optimizing the mapping of data and threads to NUMA nodes. However, these tools are not able to change the memory access pattern of the original application; therefore, code written without considering memory performance might not benefit from them. Moreover, automatic mapping tools take time to converge towards the best mapping, losing optimization opportunities. A deeper understanding of the memory behavior can help optimize it, removing the need for runtime analysis. In this paper, we present TABARNAC, a tool for analyzing the memory behavior of parallel applications with a focus on NUMA architectures. TABARNAC provides a new visualization of the memory access behavior, focusing on the distribution of accesses by thread and by structure. Such visualization allows the developer to easily understand why performance issues occur and how to fix them. Using TABARNAC, we explain why some applications do not benefit from data and thread mapping. Moreover, we propose several code modifications to improve the memory access behavior of several parallel applications.
AB - In modern parallel architectures, memory accesses represent a common bottleneck. Thus, optimizing the way applications access memory is an important way to improve performance and energy consumption. Memory accesses are even more important on NUMA machines, as the access time to data depends on its location in memory. Many efforts have been made to develop adaptive tools that improve memory accesses at runtime by optimizing the mapping of data and threads to NUMA nodes. However, these tools are not able to change the memory access pattern of the original application; therefore, code written without considering memory performance might not benefit from them. Moreover, automatic mapping tools take time to converge towards the best mapping, losing optimization opportunities. A deeper understanding of the memory behavior can help optimize it, removing the need for runtime analysis. In this paper, we present TABARNAC, a tool for analyzing the memory behavior of parallel applications with a focus on NUMA architectures. TABARNAC provides a new visualization of the memory access behavior, focusing on the distribution of accesses by thread and by structure. Such visualization allows the developer to easily understand why performance issues occur and how to fix them. Using TABARNAC, we explain why some applications do not benefit from data and thread mapping. Moreover, we propose several code modifications to improve the memory access behavior of several parallel applications.
UR - http://www.scopus.com/inward/record.url?scp=84960890935&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84960890935&partnerID=8YFLogxK
U2 - 10.1145/2835238.2835239
DO - 10.1145/2835238.2835239
M3 - Conference contribution
AN - SCOPUS:84960890935
T3 - Proceedings of VPA 2015: 2nd Workshop on Visual Performance Analysis - Held in conjunction with SC 2015: The International Conference for High Performance Computing, Networking, Storage and Analysis
BT - Proceedings of VPA 2015
PB - Association for Computing Machinery
Y2 - 20 November 2015
ER -