Exploiting large system dynamics for designing simple data center schedulers

Yousi Zheng, Ness B. Shroff, R. Srikant, Prasun Sinha

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

The number and size of data centers have grown rapidly in the last few years. It is no longer uncommon to see large data centers with thousands or even tens of thousands of machines. Hence, it is critical to develop scalable scheduling mechanisms for processing the enormous number of jobs handled by popular paradigms such as the MapReduce framework. This work explores the possibility of simplifying the scheduling procedure by exploiting the 'largeness' of the data center system. Specifically, we consider the problem of minimizing the total flow time of a sequence of jobs under the MapReduce framework, where the jobs arrive over time and need to be processed through both Map and Reduce procedures before leaving the system. We show that any work-conserving scheduler is asymptotically optimal under a wide range of traffic loads, including the heavy traffic limit. Our results are shown for scenarios in which the tasks can be preempted and served in parallel over different machines, as well as scenarios in which each task must be served on a single machine and cannot be preempted. This result implies, somewhat surprisingly, that when we have a large number of machines, there is little to be gained by optimizing beyond ensuring that the scheduler is work-conserving. For long-running applications, we also study the relationship between the number of machines and total running time, and show sufficient conditions to guarantee the asymptotic optimality of work-conserving schedulers. Further, we run extensive simulations, which verify that when the total number of machines is large, state-of-the-art work-conserving schedulers have similar and close-to-optimal delay performance.
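
To make the terms "work-conserving" and "total flow time" concrete, the following is a minimal illustrative sketch, not the paper's model or simulator. It assumes slotted time, unit-size tasks, a FIFO service order, and that a job's Reduce tasks become runnable only after all of its Map tasks have finished; the names `Job` and `total_flow_time` are hypothetical. The scheduler is work-conserving in the sense that no machine idles in a slot while a runnable task is waiting.

```python
from dataclasses import dataclass


@dataclass
class Job:
    arrival: int       # time slot in which the job arrives
    map_tasks: int     # number of unit-size Map tasks (assumed model)
    reduce_tasks: int  # number of unit-size Reduce tasks (assumed model)


def total_flow_time(jobs, machines):
    """Simulate a FIFO work-conserving scheduler in unit time slots.

    Returns the sum over all jobs of (completion time - arrival time).
    """
    # Work on copies so the caller's jobs are left untouched.
    pending = sorted(
        (Job(j.arrival, j.map_tasks, j.reduce_tasks) for j in jobs),
        key=lambda j: j.arrival,
    )
    active = []
    flow = 0
    t = 0
    while pending or active:
        # Admit every job that has arrived by the start of slot t.
        while pending and pending[0].arrival <= t:
            active.append(pending.pop(0))

        # Hand free machines to runnable tasks in FIFO order. A job's
        # Reduce tasks are runnable only once its Map tasks are done.
        free = machines
        for job in active:
            if free == 0:
                break
            if job.map_tasks > 0:
                run = min(free, job.map_tasks)
                job.map_tasks -= run
            else:
                run = min(free, job.reduce_tasks)
                job.reduce_tasks -= run
            free -= run

        t += 1  # the slot ends; tasks started in it are now complete

        # Record flow time for jobs that completed in this slot.
        for job in active:
            if job.map_tasks == 0 and job.reduce_tasks == 0:
                flow += t - job.arrival
        active = [j for j in active if j.map_tasks or j.reduce_tasks]
    return flow


if __name__ == "__main__":
    demo = [Job(0, 2, 1), Job(0, 2, 1)]
    for n in (2, 4, 100):
        print(n, total_flow_time(demo, n))
```

In this toy instance each job needs at least two slots (one Map phase, one Reduce phase), so the total flow time cannot drop below 4. The simple FIFO rule already reaches that bound once there are enough machines, which is the flavor of the large-system result described above.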

Original language: English (US)
Title of host publication: 2015 IEEE Conference on Computer Communications, IEEE INFOCOM 2015
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 397-405
Number of pages: 9
ISBN (Electronic): 9781479983810
State: Published - Aug 21 2015
Event: 34th IEEE Annual Conference on Computer Communications and Networks, IEEE INFOCOM 2015 - Hong Kong, Hong Kong
Duration: Apr 26 2015 to May 1 2015

Publication series

Name: Proceedings - IEEE INFOCOM
Volume: 26
ISSN (Print): 0743-166X

Other

Other: 34th IEEE Annual Conference on Computer Communications and Networks, IEEE INFOCOM 2015
Country/Territory: Hong Kong
City: Hong Kong
Period: 4/26/15 to 5/1/15

ASJC Scopus subject areas

  • Computer Science(all)
  • Electrical and Electronic Engineering
