TY - GEN
T1 - Towards scalable performance analysis and visualization through data reduction
AU - Lee, Chee Wai
AU - Mendes, Celso
AU - Kalé, Laxmikant V.
PY - 2008
Y1 - 2008
N2 - Performance analysis tools based on event tracing are important for understanding the complex computational activities and communication patterns in high performance applications. The purpose of these tools is to help applications scale well to large numbers of processors. However, the tools themselves have to be scalable. As application problem sizes grow larger to exploit larger machines, the volume of performance trace data generated becomes unmanageable, especially as we scale to tens of thousands of processors. Simultaneously, at analysis time, the amount of information that has to be presented to a human analyst can also become overwhelming. This paper investigates the effectiveness of employing heuristics and clustering techniques in a scalability framework to determine a subset of processors whose detailed event traces should be retained. It is a form of compression where we retain information from processors with high signal content. We quantify the reduction in the volume of performance trace data generated by NAMD, a molecular dynamics simulation application implemented using CHARM++. We show that, for the known performance problem of poor application grainsize, the quality of the trace data preserved by this approach is sufficient to highlight the problem.
AB - Performance analysis tools based on event tracing are important for understanding the complex computational activities and communication patterns in high performance applications. The purpose of these tools is to help applications scale well to large numbers of processors. However, the tools themselves have to be scalable. As application problem sizes grow larger to exploit larger machines, the volume of performance trace data generated becomes unmanageable, especially as we scale to tens of thousands of processors. Simultaneously, at analysis time, the amount of information that has to be presented to a human analyst can also become overwhelming. This paper investigates the effectiveness of employing heuristics and clustering techniques in a scalability framework to determine a subset of processors whose detailed event traces should be retained. It is a form of compression where we retain information from processors with high signal content. We quantify the reduction in the volume of performance trace data generated by NAMD, a molecular dynamics simulation application implemented using CHARM++. We show that, for the known performance problem of poor application grainsize, the quality of the trace data preserved by this approach is sufficient to highlight the problem.
UR - http://www.scopus.com/inward/record.url?scp=51049087895&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=51049087895&partnerID=8YFLogxK
U2 - 10.1109/IPDPS.2008.4536187
DO - 10.1109/IPDPS.2008.4536187
M3 - Conference contribution
AN - SCOPUS:51049087895
SN - 9781424416943
T3 - IPDPS Miami 2008 - Proceedings of the 22nd IEEE International Parallel and Distributed Processing Symposium, Program and CD-ROM
BT - IPDPS Miami 2008 - Proceedings of the 22nd IEEE International Parallel and Distributed Processing Symposium, Program and CD-ROM
T2 - IPDPS 2008 - 22nd IEEE International Parallel and Distributed Processing Symposium
Y2 - 14 April 2008 through 18 April 2008
ER -