Predicting application performance using supervised learning on communication features

Nikhil Jain, Abhinav Bhatele, Michael P. Robson, Todd Gamblin, Laxmikant V. Kale

Research output: Chapter in Book/Report/Conference proceedingConference contribution

Abstract

Task mapping on torus networks has traditionally focused on either reducing the maximum dilation or average number of hops per byte for messages in an application. These metrics make simplified assumptions about the cause of network congestion, and do not provide accurate correlation with execution time. Hence, these metrics cannot be used to reasonably predict or compare application performance for different mappings. In this paper, we attempt to model the performance of an application using communication data, such as the communication graph and network hardware counters. We use supervised learning algorithms, such as randomized decision trees, to correlate performance with prior and new metrics. We propose new hybrid metrics that provide high correlation with application performance, and may be useful for accurate performance prediction. For three different communication patterns and a production application, we demonstrate a very strong correlation between the proposed metrics and the execution time of these codes.

Original languageEnglish (US)
Title of host publicationProceedings of SC 2013
Subtitle of host publicationThe International Conference for High Performance Computing, Networking, Storage and Analysis
PublisherIEEE Computer Society
ISBN (Print)9781450323789
DOIs
StatePublished - 2013
Event2013 International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2013 - Denver, CO, United States
Duration: Nov 17 2013Nov 22 2013

Publication series

NameInternational Conference for High Performance Computing, Networking, Storage and Analysis, SC
ISSN (Print)2167-4329
ISSN (Electronic)2167-4337

Other

Other2013 International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2013
Country/TerritoryUnited States
CityDenver, CO
Period11/17/1311/22/13

Keywords

  • Contention
  • Modeling
  • Prediction
  • Supervised learning
  • Task mapping
  • Torus networks

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Computer Science Applications
  • Hardware and Architecture
  • Software

Fingerprint

Dive into the research topics of 'Predicting application performance using supervised learning on communication features'. Together they form a unique fingerprint.

Cite this