A parsimonious approach for obtaining resource-efficient and trustworthy execution

Hari Govind V. Ramasamy, Adnan Agbaria, William H. Sanders

Research output: Contribution to journal, Article, peer-review

Abstract

We propose a resource-efficient way to execute requests in Byzantine-fault-tolerant replication that is particularly well suited for services in which request processing is resource-intensive. Previous efforts took a failure masking all-active approach of using all execution replicas to execute all requests; at least 2t + 1 execution replicas are needed to mask t Byzantine-faulty ones. We describe an asynchronous protocol that provides resource-efficient execution by combining failure masking with imperfect failure detection and checkpointing. Our protocol is parsimonious since it uses only t + 1 execution replicas, called the primary committee or PC, to execute the requests under normal conditions characterized by a stable network and no misbehavior by PC replicas; thus, a trustworthy reply can be obtained with the same latency, but with only about half of the overall resource use of the all-active approach. However, a request that exposes faults among the PC replicas will cause the protocol to switch to a recovery mode, in which all 2t + 1 replicas execute the request and send their replies; then, after selecting a new PC, the protocol switches back to parsimonious execution. Such a request will incur a higher latency using our approach than the all-active approach, mainly because of fault detection latency. Practical observations point to the fact that failures and instability are the exception rather than the norm. That motivated our decision to optimize resource efficiency for the common case, even if it means paying a slightly higher performance cost during periods of instability.
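
The two execution modes described in the abstract can be illustrated with a minimal sketch. It is not the authors' protocol: it assumes a deterministic service, synchronous in-process calls, and a fixed primary committee of t + 1 of the 2t + 1 replicas, and it leaves out failure detection by timeout, checkpointing, and selection of a new PC. The names `parsimonious_execute`, `correct`, and `faulty` are hypothetical.

```python
from collections import Counter

def parsimonious_execute(request, replicas, pc_ids, t):
    """Illustrative only. `replicas` maps a replica id to a callable;
    `pc_ids` lists the t + 1 ids forming the primary committee (PC)."""
    assert len(replicas) == 2 * t + 1 and len(pc_ids) == t + 1

    # Normal case: only the PC executes. If all t + 1 replies match,
    # at least one comes from a correct replica, so the reply is trusted.
    pc_replies = [replicas[i](request) for i in pc_ids]
    if len(set(pc_replies)) == 1:
        return pc_replies[0], "parsimonious"

    # Recovery case: the request exposed a fault in the PC, so all
    # 2t + 1 replicas execute and a reply backed by t + 1 votes wins.
    all_replies = [replicas[i](request) for i in sorted(replicas)]
    reply, votes = Counter(all_replies).most_common(1)[0]
    if votes < t + 1:
        raise RuntimeError("no reply has t + 1 votes; more than t replicas are faulty")
    return reply, "recovery"

# Hypothetical replicas for t = 1: two correct, one Byzantine-faulty.
def correct(x):
    return x * x

def faulty(x):
    return -1

replicas = {0: correct, 1: faulty, 2: correct}
print(parsimonious_execute(7, replicas, [0, 2], t=1))  # (49, 'parsimonious')
print(parsimonious_execute(7, replicas, [0, 1], t=1))  # (49, 'recovery')
```

In the normal case only t + 1 of the 2t + 1 replicas do any work, which is where the roughly halved resource use comes from; the recovery path costs extra executions and, in the real protocol, fault detection latency.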

Original language: English (US)
Pages (from-to): 1-17
Number of pages: 17
Journal: IEEE Transactions on Dependable and Secure Computing
Volume: 4
Issue number: 1
State: Published - Jan 1 2007

Keywords

  • Byzantine faults
  • Distributed systems
  • Fault tolerance

ASJC Scopus subject areas

  • Electrical and Electronic Engineering
