MEASUREMENT-BASED MODEL FOR WORKLOAD DEPENDENCE OF CPU ERRORS.

Ravishankar K. Iyer, David J. Rossetti

Research output: Contribution to journal > Article > peer-review

Abstract

A methodology to measure explicitly the increase in the risk of a processor error with increasing workload is proposed. By relating the occurrence of a CPU-related error to the system activity just prior to the occurrence of an error, the approach measures the dynamic CPU workload/failure relationship. The measurements show that the probability of a CPU-related error (the load hazard) increases nonlinearly with increasing workload; i.e., the CPU rapidly deteriorates as endpoints are reached. The load hazard is observed to be most sensitive to system CPU utilization, the I/O rate, and the interrupt rates. The results are significant because they indicate that it may not be useful to push a system close to its performance limits (the previously accepted operating goal), since what we gain in slightly improved performance is more than offset by the degradation in reliability. The results also indicate that conventional reliability models need to be reevaluated to take system workload explicitly into account.
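
The idea of relating errors to the workload observed just before them can be illustrated with a minimal sketch. It is not the estimator used in the article: it assumes, purely for illustration, that the workload is a utilization fraction between 0 and 1 sampled once per interval, and it approximates the load hazard as the fraction of intervals in each workload bin that were followed by an error. The function name `load_hazard`, the `samples` layout, and the data are hypothetical.

```python
from collections import defaultdict

def load_hazard(samples, n_bins=10):
    """Binned estimate of P(error | workload), a hypothetical stand-in
    for the load hazard described in the abstract.

    samples: iterable of (load, had_error) pairs, where load is an
    assumed utilization fraction in [0, 1] for one sampling interval and
    had_error says whether a CPU-related error followed that interval.
    """
    counts = defaultdict(lambda: [0, 0])  # bin index -> [intervals, errors]
    for load, had_error in samples:
        # Map the load to a bin; a load of exactly 1.0 goes in the top bin.
        b = min(int(load * n_bins), n_bins - 1)
        counts[b][0] += 1
        counts[b][1] += int(had_error)
    # Fraction of intervals in each observed bin that had an error.
    return {b: errs / total for b, (total, errs) in sorted(counts.items())}

# Synthetic illustration only; these are not measurements from the article.
samples = [(0.15, False), (0.15, False), (0.15, False), (0.15, True),
           (0.95, True), (0.95, False)]
print(load_hazard(samples))
```

Under these assumptions, a hazard that rises faster than linearly across the bins would correspond to the nonlinear increase reported in the abstract. The same binning could be repeated for other workload measures, such as the I/O rate or interrupt rate, once they are scaled to a common range.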

Original language: English (US)
Pages (from-to): 511-519
Number of pages: 9
Journal: IEEE Transactions on Computers
Volume: C-35
Issue number: 6
State: Published - 1986
Externally published: Yes

ASJC Scopus subject areas

  • Software
  • Theoretical Computer Science
  • Hardware and Architecture
  • Computational Theory and Mathematics
