Reliable on-demand management operations for large-scale distributed applications

Research output: Contribution to journal › Conference article

Abstract

This paper argues for attention to, and proposes a novel direction to solving, instant monitoring and management tasks for large-scale distributed applications running across hundreds of hosts. We present the MON (Management Overlay Networks) approach, which uses a novel concept called on-demand overlays, in order to support instant commands such as queries and software pushes. On-demand overlays are built on-the-fly and probabilistically, by leveraging weakly-consistent gossip-style membership information underneath. Thus, they are lightweight in terms of memory, computation, and bandwidth. We augment on-demand overlays with several notions of application-specified reliability, and show how MON detects and adheres to these. MON is available atop PlanetLab, and we present experimental results. We conclude with a series of promising open problems in this direction.

Original language: English (US)
Pages (from-to): 82-88
Number of pages: 7
Journal: Operating Systems Review (ACM)
Volume: 41
Issue number: 5
DOI: 10.1145/1317379.1317392
State: Published - Oct 1 2007
Event: Gossip-Based Computer Networking - Leiden, Netherlands
Duration: Dec 1 2006 to Dec 1 2006

Fingerprint

Overlay networks
Bandwidth
Data storage equipment
Monitoring

Keywords

  • Instant commands
  • Monitoring
  • On-demand overlays
  • Reliability

ASJC Scopus subject areas

  • Information Systems
  • Hardware and Architecture
  • Computer Networks and Communications

Cite this

Reliable on-demand management operations for large-scale distributed applications. / Liang, Jin; Gupta, Indranil; Nahrstedt, Klara.

In: Operating Systems Review (ACM), Vol. 41, No. 5, 01.10.2007, p. 82-88.

@article{0ef9b725d5e848a7897d27473e92dbbe,
title = "Reliable on-demand management operations for large-scale distributed applications",
abstract = "This paper argues for attention to, and proposes a novel direction to solving, instant monitoring and management tasks for large-scale distributed applications running across hundreds of hosts. We present the MON (Management Overlay Networks) approach, which uses a novel concept called on-demand overlays, in order to support instant commands such as queries and software pushes. On-demand overlays are built on-the-fly and probabilistically, by leveraging weakly-consistent gossip-style membership information underneath. Thus, they are lightweight in terms of memory, computation, and bandwidth. We augment on-demand overlays with several notions of application-specified reliability, and show how MON detects and adheres to these. MON is available atop PlanetLab, and we present experimental results. We conclude with a series of promising open problems in this direction.",
keywords = "Instant commands, Monitoring, On-demand overlays, Reliability",
author = "Jin Liang and Indranil Gupta and Klara Nahrstedt",
year = "2007",
month = "10",
day = "1",
doi = "10.1145/1317379.1317392",
language = "English (US)",
volume = "41",
pages = "82--88",
journal = "Operating Systems Review (ACM)",
issn = "0163-5980",
publisher = "Association for Computing Machinery (ACM)",
number = "5"
}

TY  - JOUR
T1  - Reliable on-demand management operations for large-scale distributed applications
AU  - Liang, Jin
AU  - Gupta, Indranil
AU  - Nahrstedt, Klara
PY  - 2007/10/1
Y1  - 2007/10/1
N2  - This paper argues for attention to, and proposes a novel direction to solving, instant monitoring and management tasks for large-scale distributed applications running across hundreds of hosts. We present the MON (Management Overlay Networks) approach, which uses a novel concept called on-demand overlays, in order to support instant commands such as queries and software pushes. On-demand overlays are built on-the-fly and probabilistically, by leveraging weakly-consistent gossip-style membership information underneath. Thus, they are lightweight in terms of memory, computation, and bandwidth. We augment on-demand overlays with several notions of application-specified reliability, and show how MON detects and adheres to these. MON is available atop PlanetLab, and we present experimental results. We conclude with a series of promising open problems in this direction.
AB  - This paper argues for attention to, and proposes a novel direction to solving, instant monitoring and management tasks for large-scale distributed applications running across hundreds of hosts. We present the MON (Management Overlay Networks) approach, which uses a novel concept called on-demand overlays, in order to support instant commands such as queries and software pushes. On-demand overlays are built on-the-fly and probabilistically, by leveraging weakly-consistent gossip-style membership information underneath. Thus, they are lightweight in terms of memory, computation, and bandwidth. We augment on-demand overlays with several notions of application-specified reliability, and show how MON detects and adheres to these. MON is available atop PlanetLab, and we present experimental results. We conclude with a series of promising open problems in this direction.
KW  - Instant commands
KW  - Monitoring
KW  - On-demand overlays
KW  - Reliability
UR  - http://www.scopus.com/inward/record.url?scp=62749137379&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=62749137379&partnerID=8YFLogxK
U2  - 10.1145/1317379.1317392
DO  - 10.1145/1317379.1317392
M3  - Conference article
AN  - SCOPUS:62749137379
VL  - 41
SP  - 82
EP  - 88
JO  - Operating Systems Review (ACM)
JF  - Operating Systems Review (ACM)
SN  - 0163-5980
IS  - 5
ER  -