TY - GEN
T1 - Experimental evaluation of the unavailability induced by a group membership protocol
AU - Joshi, Kaustubh R.
AU - Cukier, Michel
AU - Sanders, William H.
PY - 2002/1/1
Y1 - 2002/1/1
N2 - Group communication is an important paradigm for building highly available distributed systems. However, group membership operations often require the system to block message traffic, causing system services to become unavailable. This makes it important to quantify the unavailability induced by membership operations. This paper experimentally evaluates the blocking behavior of the group membership protocol of the Ensemble group communication system using a novel global-state-based fault injection technique. In doing so, we demonstrate how a layered distributed protocol such as the Ensemble group membership protocol can be modeled in terms of a state machine abstraction, and show how the resulting global state space can be used to specify fault triggers and define important measures on the system. Using this approach, we evaluate the cost associated with important states of the protocol under varying workload and group size. We also evaluate the sensitivity of the protocol to the occurrence of a second correlated crash failure during its operation.
UR - http://www.scopus.com/inward/record.url?scp=84937547021&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84937547021&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:84937547021
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 140
EP - 158
BT - Dependable Computing - EDCC-4 - 4th European Dependable Computing Conference, Proceedings
A2 - Bondavalli, Andrea
A2 - Thevenod-Fosse, Pascale
PB - Springer
T2 - 4th European Dependable Computing Conference, EDCC 2002
Y2 - 23 October 2002 through 25 October 2002
ER -