BayesPerf: Minimizing performance monitoring errors using Bayesian statistics

Subho S. Banerjee, Saurabh Jha, Zbigniew Kalbarczyk, Ravishankar K. Iyer

Research output: Chapter in Book/Report/Conference proceedingConference contribution

Abstract

Hardware performance counters (HPCs) that measure low-level architectural and microarchitectural events provide dynamic contextual information about the state of the system. However, HPC measurements are error-prone due to non determinism (e.g., undercounting due to event multiplexing, or OS interrupt-handling behaviors). In this paper, we present BayesPerf, a system for quantifying uncertainty in HPC measurements by using a domain-driven Bayesian model that captures microarchitectural relationships between HPCs to jointly infer their values as probability distributions. We provide the design and implementation of an accelerator that allows for low-latency and low-power inference of the BayesPerf model for x86 and ppc64 CPUs. BayesPerf reduces the average error in HPC measurements from 40.1% to 7.6% when events are being multiplexed. The value of BayesPerf in real-time decision-making is illustrated with a simple example of scheduling of PCIe transfers.

Original languageEnglish (US)
Title of host publicationProceedings of the 26th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, ASPLOS 2021
PublisherAssociation for Computing Machinery
Pages832-844
Number of pages13
ISBN (Electronic)9781450383172
DOIs
StatePublished - Apr 19 2021
Event26th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, ASPLOS 2021 - Virtual, Online, United States
Duration: Apr 19 2021Apr 23 2021

Publication series

NameInternational Conference on Architectural Support for Programming Languages and Operating Systems - ASPLOS

Conference

Conference26th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, ASPLOS 2021
CountryUnited States
CityVirtual, Online
Period4/19/214/23/21

Keywords

  • Accelerator
  • Error Correction
  • Error Detection
  • Performance Counter
  • Probabilistic Graphical Model
  • Sampling Errors

ASJC Scopus subject areas

  • Software
  • Information Systems
  • Hardware and Architecture

Fingerprint Dive into the research topics of 'BayesPerf: Minimizing performance monitoring errors using Bayesian statistics'. Together they form a unique fingerprint.

Cite this