Roll-forward and rollback recovery: Performance-reliability trade-off

Dhiraj K. Pradhan, Nitin H. Vaidya

Research output: Chapter in Book/Report/Conference proceedingConference contribution

Abstract

Performance and reliability achieved by a modular redundant system depend on the recovery scheme used. Typically, gain in performance using comparable resources results in reduced reliability. Several high-performance computers are noted for small mean time to failure. Performance is measured here in terms of mean and variance of the task completion time, reliability being a task-based measure defined as the probability that a task is completed correctly. Two roll-forward schemes are compared with two rollback schemes for achieving recovery in duplex systems. The roll-forward schemes discussed here are based on a roll-forward checkpointing concept proposed in [5-8]. Roll-forward recovery schemes achieve significantly better performance than rollback schemes by avoiding rollback in most common fault scenarios. It is shown that the roll-forward schemes improve performance with only a small loss in reliability as compared to rollback schemes.

Original languageEnglish (US)
Title of host publicationDigest of Papers - International Symposium on Fault-Tolerant Computing
PublisherPubl by IEEE
Pages186-195
Number of pages10
ISBN (Print)0818655224
StatePublished - Jan 1 1994
EventProceedings of the 24th International Symposium on Fault-Tolerant Computing - Austin, TX, USA
Duration: Jun 15 1994Jun 17 1994

Publication series

NameDigest of Papers - International Symposium on Fault-Tolerant Computing
ISSN (Print)0731-3071

Other

OtherProceedings of the 24th International Symposium on Fault-Tolerant Computing
CityAustin, TX, USA
Period6/15/946/17/94

ASJC Scopus subject areas

  • Hardware and Architecture
  • Engineering(all)

Fingerprint Dive into the research topics of 'Roll-forward and rollback recovery: Performance-reliability trade-off'. Together they form a unique fingerprint.

Cite this