Cluster-based failure detection service for large-scale ad hoc wireless network applications

Ann T. Tai, Kam S. Tso, William H. Sanders

Research output: Contribution to conference (Paper)

Abstract

The growing interest in ad hoc wireless network applications that are made of large and dense populations of lightweight system resources calls for scalable approaches to fault tolerance. Moreover, the nature of these systems creates significant challenges for the development of failure detection services (FDSs), because their quality often depends heavily on reliable communication. In particular, ad hoc wireless networks are notoriously vulnerable to message loss, which precludes deterministic guarantees for the completeness and accuracy properties of FDSs. To meet the challenges, we propose an FDS based on the notion of clustering. Specifically, we use a cluster-based communication architecture to permit the FDS to be implemented in a distributed manner via intra-cluster heartbeat diffusion and to allow a failure report to be forwarded across clusters through the upper layer of the communication hierarchy. In doing so, we extensively exploit the message redundancy that is inherent in ad hoc wireless settings to mitigate the effects of message loss on the accuracy and completeness properties of failure detection. As shown by our mathematical analysis, the resulting FDS is able to provide satisfactory probabilistic guarantees for the desired properties.
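
As a rough illustration of why message redundancy can soften the effect of message loss on failure detection, the following Python sketch computes two simple probabilities. It is not the analysis from the paper. It assumes, purely for illustration, that each copy of a heartbeat is lost independently with probability loss_prob, that an observer can receive a heartbeat through a number of redundant transmissions (copies), and that a live node is wrongly suspected only if its heartbeat is missed in a number of consecutive rounds (rounds). The function and parameter names are hypothetical.

```python
def miss_probability(loss_prob: float, copies: int) -> float:
    """Probability that every one of `copies` redundant transmissions
    of a single heartbeat is lost (assumes independent losses)."""
    if not 0.0 <= loss_prob <= 1.0:
        raise ValueError("loss_prob must be in [0, 1]")
    if copies < 1:
        raise ValueError("copies must be at least 1")
    return loss_prob ** copies


def false_suspicion_probability(loss_prob: float, copies: int, rounds: int) -> float:
    """Probability that a live node is wrongly suspected, under the
    illustrative rule that suspicion requires `rounds` consecutive
    heartbeat rounds in which all redundant copies are lost."""
    if rounds < 1:
        raise ValueError("rounds must be at least 1")
    return miss_probability(loss_prob, copies) ** rounds


if __name__ == "__main__":
    # Compare a single transmission with three redundant copies.
    for copies in (1, 3):
        p = false_suspicion_probability(loss_prob=0.1, copies=copies, rounds=2)
        print(f"copies={copies}: false suspicion probability = {p:.2e}")
```

Under these assumptions the probability of a false suspicion falls geometrically with the number of redundant copies, which is the intuition behind a probabilistic rather than deterministic accuracy guarantee. Real wireless losses are often correlated, so the independence assumption is optimistic.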

Original language: English (US)
Pages: 805-814
Number of pages: 10
State: Published - Oct 1 2004
Event: 2004 International Conference on Dependable Systems and Networks - Florence, Italy
Duration: Jun 28 2004 to Jul 1 2004

Other

Other: 2004 International Conference on Dependable Systems and Networks
Country: Italy
City: Florence
Period: 6/28/04 to 7/1/04

Fingerprint

Wireless ad hoc networks
Communication
Fault tolerance
Redundancy

ASJC Scopus subject areas

  • Software
  • Hardware and Architecture
  • Computer Networks and Communications

Cite this

Tai, A. T., Tso, K. S., & Sanders, W. H. (2004). Cluster-based failure detection service for large-scale ad hoc wireless network applications. 805-814. Paper presented at 2004 International Conference on Dependable Systems and Networks, Florence, Italy.

@conference{efec6fb7901c49ad9f2938ebd229e87b,
title = "Cluster-based failure detection service for large-scale ad hoc wireless network applications",
abstract = "The growing interest in ad hoc wireless network applications that are made of large and dense populations of lightweight system resources calls for scalable approaches to fault tolerance. Moreover, the nature of these systems creates significant challenges for the development of failure detection services (FDSs), because their quality often depends heavily on reliable communication. In particular, ad hoc wireless networks are notoriously vulnerable to message loss, which precludes deterministic guarantees for the completeness and accuracy properties of FDSs. To meet the challenges, we propose an FDS based on the notion of clustering. Specifically, we use a cluster-based communication architecture to permit the FDS to be implemented in a distributed manner via intra-cluster heartbeat diffusion and to allow a failure report to be forwarded across clusters through the upper layer of the communication hierarchy. In doing so, we extensively exploit the message redundancy that is inherent in ad hoc wireless settings to mitigate the effects of message loss on the accuracy and completeness properties of failure detection. As shown by our mathematical analysis, the resulting FDS is able to provide satisfactory probabilistic guarantees for the desired properties.",
author = "Tai, {Ann T.} and Tso, {Kam S.} and Sanders, {William H.}",
year = "2004",
month = "10",
day = "1",
language = "English (US)",
pages = "805--814",
note = "2004 International Conference on Dependable Systems and Networks; Conference date: 28-06-2004 through 01-07-2004",
}

TY - CONF

T1 - Cluster-based failure detection service for large-scale ad hoc wireless network applications

AU - Tai, Ann T.

AU - Tso, Kam S.

AU - Sanders, William H.

PY - 2004/10/1

Y1 - 2004/10/1

N2 - The growing interest in ad hoc wireless network applications that are made of large and dense populations of lightweight system resources calls for scalable approaches to fault tolerance. Moreover, the nature of these systems creates significant challenges for the development of failure detection services (FDSs), because their quality often depends heavily on reliable communication. In particular, ad hoc wireless networks are notoriously vulnerable to message loss, which precludes deterministic guarantees for the completeness and accuracy properties of FDSs. To meet the challenges, we propose an FDS based on the notion of clustering. Specifically, we use a cluster-based communication architecture to permit the FDS to be implemented in a distributed manner via intra-cluster heartbeat diffusion and to allow a failure report to be forwarded across clusters through the upper layer of the communication hierarchy. In doing so, we extensively exploit the message redundancy that is inherent in ad hoc wireless settings to mitigate the effects of message loss on the accuracy and completeness properties of failure detection. As shown by our mathematical analysis, the resulting FDS is able to provide satisfactory probabilistic guarantees for the desired properties.

UR - http://www.scopus.com/inward/record.url?scp=4544273060&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=4544273060&partnerID=8YFLogxK

M3 - Paper

AN - SCOPUS:4544273060

SP - 805

EP - 814

ER -