Holistic measurement-driven system assessment

Saurabh Jha, Jim Brandt, Ann Gentile, Zbigniew T Kalbarczyk, Gregory H Bauer, Jeremy James Enos, Michael Showerman, Larry Kaplan, Brett Bode, Annette Greiner, Amanda Bonnie, Mike Mason, Ravishankar K Iyer, William T Kramer

Research output: Chapter in Book/Report/Conference proceedingConference contribution


In high-performance computing systems, application performance and throughput are dependent on a complex interplay of hardware and software subsystems and variable workloads with competing resource demands. Data-driven insights into the potentially widespread scope and propagationof impact of events, such as faults and contention for shared resources, can be used to drive more effective use of resources, for improved root cause diagnosis, and for predicting performance impacts. We present work developing integrated capabilities for holistic monitoring and analysis to understand and characterize propagation of performance-degrading events. These characterizations can be used to determine and invoke mitigating responses by system administrators, applications, and system software.

Original languageEnglish (US)
Title of host publicationProceedings - 2017 IEEE International Conference on Cluster Computing, CLUSTER 2017
PublisherInstitute of Electrical and Electronics Engineers Inc.
Number of pages4
ISBN (Electronic)9781538623268
StatePublished - Sep 22 2017
Event2017 IEEE International Conference on Cluster Computing, CLUSTER 2017 - Honolulu, United States
Duration: Sep 5 2017Sep 8 2017

Publication series

NameProceedings - IEEE International Conference on Cluster Computing, ICCC
ISSN (Print)1552-5244


Other2017 IEEE International Conference on Cluster Computing, CLUSTER 2017
CountryUnited States

ASJC Scopus subject areas

  • Software
  • Hardware and Architecture
  • Signal Processing

Fingerprint Dive into the research topics of 'Holistic measurement-driven system assessment'. Together they form a unique fingerprint.

Cite this