The Packing Server for real-time scheduling of MapReduce workflows

Shen Li, Shaohan Hu, Tarek Abdelzaher

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

This paper develops new schedulability bounds for a simplified MapReduce workflow model. MapReduce is a distributed computing paradigm, deployed in industry for over a decade. Different from conventional multiprocessor platforms, MapReduce deployments usually span thousands of machines, and a MapReduce job may contain as many as tens of thousands of parallel segments. State-of-the-art MapReduce workflow schedulers operate in a best-effort fashion, but the need for real-time operation has grown with the emergence of real-time analytic applications. MapReduce workflow details can be captured by the generalized parallel task model from recent real-time literature. Under this model, the best-known result guarantees schedulability if the task set utilization stays below 50% of total capacity, and the deadline to critical path length ratio, which we call the stretch φ, surpasses 2. This paper improves this bound further by introducing a hierarchical scheduling scheme based on the novel notion of a Packing Server, inspired by servers for aperiodic tasks. The Packing Server consists of multiple periodically replenished budgets that can execute in parallel and that appear as independent tasks to the underlying scheduler. Hence, the original problem of scheduling MapReduce workflows reduces to that of scheduling independent tasks. We prove that the utilization bound for schedulability of MapReduce workflows is dependent on the underlying independent task scheduling policy, and β is a tunable parameter that controls the maximum individual budget utilization.

Original language: English (US)
Title of host publication: Proceedings - 21st IEEE Real Time and Embedded Technology and Applications Symposium, RTAS 2015
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 51-62
Number of pages: 12
ISBN (Electronic): 9781479986033
DOIs: https://doi.org/10.1109/RTAS.2015.7108416
State: Published - May 14 2015
Event: 21st IEEE Real Time and Embedded Technology and Applications Symposium, RTAS 2015 - Seattle, United States
Duration: Apr 13 2015 → Apr 16 2015

Publication series

Name: Proceedings of the IEEE Real-Time and Embedded Technology and Applications Symposium, RTAS
Volume: 2015-May
ISSN (Print): 1545-3421

Other

Other: 21st IEEE Real Time and Embedded Technology and Applications Symposium, RTAS 2015
Country: United States
City: Seattle
Period: 4/13/15 → 4/16/15

ASJC Scopus subject areas

  • Engineering(all)

Cite this

Li, S., Hu, S., & Abdelzaher, T. (2015). The Packing Server for real-time scheduling of MapReduce workflows. In Proceedings - 21st IEEE Real Time and Embedded Technology and Applications Symposium, RTAS 2015 (pp. 51-62). [7108416] (Proceedings of the IEEE Real-Time and Embedded Technology and Applications Symposium, RTAS; Vol. 2015-May). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/RTAS.2015.7108416