A dynamic replica selection algorithm for tolerating timing faults

Sudha Krishnamurthy, William H. Sanders, Michel Cukier

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Server replication is commonly used to improve the fault tolerance and response time of distributed services. An important problem when executing time-critical applications in a replicated environment is that of preventing timing failures by dynamically selecting the replicas that can satisfy a client's timing requirement, even when the quality of service is degraded due to replica failures and excess load on the server. In this paper, we describe the approach we have used to solve this problem in AQUA, a CORBA-based middleware that transparently replicates objects across a local area network. The approach we use estimates a replica's response time distribution based on performance measurements regularly broadcast by the replica. An online model uses these measurements to predict the probability with which a replica can prevent a timing failure for a client. A selection algorithm then uses this prediction to choose a subset of replicas that can together meet the client's timing constraints with at least the probability requested by the client. We conclude with experimental results based on our implementation.
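
The pipeline the abstract outlines (measure, predict, select) can be illustrated with a minimal sketch. It is not the AQUA implementation. It assumes an empirical distribution built from a window of recent response-time samples, independent replicas, and a greedy choice of the most promising replicas first. The names prob_within_deadline, select_replicas, history, deadline_ms, and min_prob are hypothetical and do not come from the paper.

```python
from typing import Dict, List, Sequence


def prob_within_deadline(samples: Sequence[float], deadline_ms: float) -> float:
    """Empirical estimate of P(response time <= deadline) from recent samples."""
    if not samples:
        return 0.0  # no measurements yet: assume the replica cannot be relied on
    return sum(1 for s in samples if s <= deadline_ms) / len(samples)


def select_replicas(history: Dict[str, Sequence[float]],
                    deadline_ms: float,
                    min_prob: float) -> List[str]:
    """Pick replicas until P(at least one responds in time) >= min_prob.

    Assumption (not from the paper): replica response times are independent,
    so P(all selected replicas miss) is the product of the individual misses.
    If the target cannot be reached, every replica is returned.
    """
    probs = {r: prob_within_deadline(s, deadline_ms) for r, s in history.items()}
    # Most promising replicas first; ties broken by name for determinism.
    ranked = sorted(probs, key=lambda r: (-probs[r], r))

    chosen: List[str] = []
    p_all_miss = 1.0
    for replica in ranked:
        chosen.append(replica)
        p_all_miss *= 1.0 - probs[replica]
        if 1.0 - p_all_miss >= min_prob:
            break
    return chosen


# Hypothetical response-time samples in milliseconds, one window per replica.
history = {
    "a": [80, 90, 120, 150],
    "b": [60, 70, 95, 200],
    "c": [300, 400, 500, 600],
}
print(select_replicas(history, deadline_ms=100, min_prob=0.85))
```

With these made-up samples and a 100 ms deadline, replica b alone meets the deadline in 3 of 4 samples, so a client asking for a probability of 0.85 needs a second replica. A stricter request that no subset can satisfy falls back to the full replica set in this sketch.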

Original language: English (US)
Title of host publication: Proceedings of the International Conference on Dependable Systems and Networks
Editors: D.C. Young
Pages: 107-116
Number of pages: 10
State: Published - 2001
Event: Proceedings of the International Conference on Dependable Systems and Networks - Goteborg, Sweden
Duration: Jul 1 2001 to Jul 4 2001

Publication series

Name: Proceedings of the International Conference on Dependable Systems and Networks

Other

Other: Proceedings of the International Conference on Dependable Systems and Networks
Country/Territory: Sweden
City: Goteborg
Period: 7/1/01 to 7/4/01

ASJC Scopus subject areas

  • Software
  • Hardware and Architecture
  • Computer Networks and Communications
