Jarvis: Large-scale Server Monitoring with Adaptive Near-data Processing

Atul Sandur, Chan Ho Park, Stavros Volos, Gul Agha, Myeongjae Jeon

Research output: Chapter in Book/Report/Conference proceedingConference contribution

Abstract

Rapid detection and mitigation of issues that impact performance and reliability are paramount for large-scale online services. For real-time detection of such issues, datacenter operators use a stream processor and analyze streams of monitoring data collected from servers (referred to as data source nodes) and their hosted services. The timely processing of incoming streams requires the network to transfer massive amounts of data, and significant compute resources to process it. These factors often create bottlenecks for stream analytics. To help overcome these bottlenecks, current monitoring systems employ near-data processing by either computing an optimal query partition based on a cost model or using model-agnostic heuristics. Optimal partitioning is computationally expensive, while model-agnostic heuristics are iterative and search over a large solution space. We combine these approaches by using model-agnostic heuristics to improve the partitioning solution from a model-based heuristic. Moreover, current systems use operator-level partitioning: if a data source does not have sufficient resources to execute an operator on all records, the operator is executed only on the stream processor. Instead, we perform data-level partitioning - i.e., we allow an operator to be executed both on a stream processor and data sources. We implement our algorithm in a system called Jarvis, which enables quick adaptation to dynamic resource conditions. Our evaluation on a diverse set of monitoring workloads suggests that Jarvis converges to a stable query partition within seconds of a change in node resource conditions. Compared to current partitioning strategies, Jarvis handles up to 75% more data sources while improving throughput in resource-constrained scenarios by 1.2-4.4×.

Original languageEnglish (US)
Title of host publicationProceedings - 2022 IEEE 38th International Conference on Data Engineering, ICDE 2022
PublisherIEEE Computer Society
Pages1408-1422
Number of pages15
ISBN (Electronic)9781665408837
DOIs
StatePublished - 2022
Event38th IEEE International Conference on Data Engineering, ICDE 2022 - Virtual, Online, Malaysia
Duration: May 9 2022May 12 2022

Publication series

NameProceedings - International Conference on Data Engineering
Volume2022-May
ISSN (Print)1084-4627

Conference

Conference38th IEEE International Conference on Data Engineering, ICDE 2022
Country/TerritoryMalaysia
CityVirtual, Online
Period5/9/225/12/22

Keywords

  • analytics
  • edge analytics
  • near-data
  • query partitioning
  • query refinement
  • server monitoring
  • stream processing

ASJC Scopus subject areas

  • Software
  • Signal Processing
  • Information Systems

Fingerprint

Dive into the research topics of 'Jarvis: Large-scale Server Monitoring with Adaptive Near-data Processing'. Together they form a unique fingerprint.

Cite this