Deadline-based workload management for MapReduce environments: Pieces of the performance puzzle

Abhishek Verma, Ludmila Cherkasova, Vijay S. Kumar, Roy H. Campbell

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Hadoop, and the associated MapReduce paradigm, has become the de facto platform for cost-effective analytics over "Big Data". There is an increasing number of MapReduce applications associated with live business intelligence that require completion time guarantees. In this work, we introduce and analyze a set of complementary mechanisms that enhance workload management decisions for processing MapReduce jobs with deadlines. The three mechanisms we consider are the following: 1) a policy for job ordering in the processing queue; 2) a mechanism for allocating a tailored number of map and reduce slots to each job with a completion time requirement; 3) a mechanism for allocating and deallocating (if necessary) spare resources in the system among the active jobs. We analyze the functionality and performance benefits of each mechanism via an extensive set of simulations over diverse workload sets. The proposed mechanisms form the integral pieces in the performance puzzle of automated workload management in MapReduce environments.
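
The three mechanisms listed in the abstract can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not taken from the paper: the earliest-deadline-first ordering, the single pooled slot type (the abstract distinguishes map and reduce slots), the `work / deadline` sizing rule, the round-robin handling of spare slots, and all names (`Job`, `order_queue`, `tailored_slots`, `allocate`). Deallocation of spare resources is not modeled.

```python
import math
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    deadline: float  # seconds from now; assumed > 0
    work: float      # hypothetical total work, in slot-seconds


def order_queue(jobs):
    """Mechanism 1 (assumed policy): earliest deadline first."""
    return sorted(jobs, key=lambda j: j.deadline)


def tailored_slots(job):
    """Mechanism 2 (toy model): fewest slots that finish `work` by the deadline."""
    return math.ceil(job.work / job.deadline)


def allocate(jobs, total_slots):
    """Return {job name: slots} for the jobs that can start now."""
    alloc = {}
    free = total_slots
    for job in order_queue(jobs):
        need = tailored_slots(job)
        if need > free:
            continue  # not enough slots; the job stays queued
        alloc[job.name] = need
        free -= need
    # Mechanism 3 (toy model): hand spare slots to active jobs round-robin.
    active = list(alloc)
    i = 0
    while free > 0 and active:
        alloc[active[i % len(active)]] += 1
        free -= 1
        i += 1
    return alloc


jobs = [Job("a", 100, 400), Job("b", 50, 300), Job("c", 200, 2000)]
print(allocate(jobs, total_slots=12))
```

In this example job `c` would need 10 slots and stays queued, so the two spare slots go to the active jobs `b` and `a`.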

Original language: English (US)
Title of host publication: Proceedings of the 2012 IEEE Network Operations and Management Symposium, NOMS 2012
Pages: 900-905
Number of pages: 6
DOIs: https://doi.org/10.1109/NOMS.2012.6212006
State: Published - Jul 30 2012
Event: 2012 IEEE Network Operations and Management Symposium, NOMS 2012 - Maui, HI, United States
Duration: Apr 16 2012 to Apr 20 2012

Publication series

Name: Proceedings of the 2012 IEEE Network Operations and Management Symposium, NOMS 2012

Other

Other: 2012 IEEE Network Operations and Management Symposium, NOMS 2012
Country: United States
City: Maui, HI
Period: 4/16/12 to 4/20/12

Fingerprint

MapReduce
Deadline
Workload
Management decisions
Hadoop
Functionality
Resources
Guarantee
Simulation
Paradigm
Business intelligence
Queue
Integral

Keywords

  • MapReduce
  • Performance
  • Resource Allocation

ASJC Scopus subject areas

  • Management Science and Operations Research

Cite this

Verma, A., Cherkasova, L., Kumar, V. S., & Campbell, R. H. (2012). Deadline-based workload management for MapReduce environments: Pieces of the performance puzzle. In Proceedings of the 2012 IEEE Network Operations and Management Symposium, NOMS 2012 (pp. 900-905). [6212006] (Proceedings of the 2012 IEEE Network Operations and Management Symposium, NOMS 2012). https://doi.org/10.1109/NOMS.2012.6212006

Deadline-based workload management for MapReduce environments: Pieces of the performance puzzle. / Verma, Abhishek; Cherkasova, Ludmila; Kumar, Vijay S.; Campbell, Roy H.

Proceedings of the 2012 IEEE Network Operations and Management Symposium, NOMS 2012. 2012. p. 900-905 6212006 (Proceedings of the 2012 IEEE Network Operations and Management Symposium, NOMS 2012).

Verma, A, Cherkasova, L, Kumar, VS & Campbell, RH 2012, Deadline-based workload management for MapReduce environments: Pieces of the performance puzzle. in Proceedings of the 2012 IEEE Network Operations and Management Symposium, NOMS 2012., 6212006, Proceedings of the 2012 IEEE Network Operations and Management Symposium, NOMS 2012, pp. 900-905, 2012 IEEE Network Operations and Management Symposium, NOMS 2012, Maui, HI, United States, 4/16/12. https://doi.org/10.1109/NOMS.2012.6212006
Verma A, Cherkasova L, Kumar VS, Campbell RH. Deadline-based workload management for MapReduce environments: Pieces of the performance puzzle. In Proceedings of the 2012 IEEE Network Operations and Management Symposium, NOMS 2012. 2012. p. 900-905. 6212006. (Proceedings of the 2012 IEEE Network Operations and Management Symposium, NOMS 2012). https://doi.org/10.1109/NOMS.2012.6212006
Verma, Abhishek; Cherkasova, Ludmila; Kumar, Vijay S.; Campbell, Roy H. / Deadline-based workload management for MapReduce environments: Pieces of the performance puzzle. Proceedings of the 2012 IEEE Network Operations and Management Symposium, NOMS 2012. 2012. pp. 900-905 (Proceedings of the 2012 IEEE Network Operations and Management Symposium, NOMS 2012).
@inproceedings{fac56f24ff13451ea5d09f90d7d0498e,
title = "Deadline-based workload management for MapReduce environments: Pieces of the performance puzzle",
abstract = "Hadoop, and the associated MapReduce paradigm, has become the de facto platform for cost-effective analytics over {"}Big Data{"}. There is an increasing number of MapReduce applications associated with live business intelligence that require completion time guarantees. In this work, we introduce and analyze a set of complementary mechanisms that enhance workload management decisions for processing MapReduce jobs with deadlines. The three mechanisms we consider are the following: 1) a policy for job ordering in the processing queue; 2) a mechanism for allocating a tailored number of map and reduce slots to each job with a completion time requirement; 3) a mechanism for allocating and deallocating (if necessary) spare resources in the system among the active jobs. We analyze the functionality and performance benefits of each mechanism via an extensive set of simulations over diverse workload sets. The proposed mechanisms form the integral pieces in the performance puzzle of automated workload management in MapReduce environments.",
keywords = "MapReduce, Performance, Resource Allocation",
author = "Verma, Abhishek and Cherkasova, Ludmila and Kumar, {Vijay S.} and Campbell, {Roy H.}",
year = "2012",
month = "7",
day = "30",
doi = "10.1109/NOMS.2012.6212006",
language = "English (US)",
isbn = "9781467302685",
series = "Proceedings of the 2012 IEEE Network Operations and Management Symposium, NOMS 2012",
pages = "900--905",
booktitle = "Proceedings of the 2012 IEEE Network Operations and Management Symposium, NOMS 2012",

}

TY - GEN

T1 - Deadline-based workload management for MapReduce environments

T2 - Pieces of the performance puzzle

AU - Verma, Abhishek

AU - Cherkasova, Ludmila

AU - Kumar, Vijay S.

AU - Campbell, Roy H.

PY - 2012/7/30

Y1 - 2012/7/30

N2 - Hadoop, and the associated MapReduce paradigm, has become the de facto platform for cost-effective analytics over "Big Data". There is an increasing number of MapReduce applications associated with live business intelligence that require completion time guarantees. In this work, we introduce and analyze a set of complementary mechanisms that enhance workload management decisions for processing MapReduce jobs with deadlines. The three mechanisms we consider are the following: 1) a policy for job ordering in the processing queue; 2) a mechanism for allocating a tailored number of map and reduce slots to each job with a completion time requirement; 3) a mechanism for allocating and deallocating (if necessary) spare resources in the system among the active jobs. We analyze the functionality and performance benefits of each mechanism via an extensive set of simulations over diverse workload sets. The proposed mechanisms form the integral pieces in the performance puzzle of automated workload management in MapReduce environments.

AB - Hadoop, and the associated MapReduce paradigm, has become the de facto platform for cost-effective analytics over "Big Data". There is an increasing number of MapReduce applications associated with live business intelligence that require completion time guarantees. In this work, we introduce and analyze a set of complementary mechanisms that enhance workload management decisions for processing MapReduce jobs with deadlines. The three mechanisms we consider are the following: 1) a policy for job ordering in the processing queue; 2) a mechanism for allocating a tailored number of map and reduce slots to each job with a completion time requirement; 3) a mechanism for allocating and deallocating (if necessary) spare resources in the system among the active jobs. We analyze the functionality and performance benefits of each mechanism via an extensive set of simulations over diverse workload sets. The proposed mechanisms form the integral pieces in the performance puzzle of automated workload management in MapReduce environments.

KW - MapReduce

KW - Performance

KW - Resource Allocation

UR - http://www.scopus.com/inward/record.url?scp=84864188522&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84864188522&partnerID=8YFLogxK

U2 - 10.1109/NOMS.2012.6212006

DO - 10.1109/NOMS.2012.6212006

M3 - Conference contribution

AN - SCOPUS:84864188522

SN - 9781467302685

T3 - Proceedings of the 2012 IEEE Network Operations and Management Symposium, NOMS 2012

SP - 900

EP - 905

BT - Proceedings of the 2012 IEEE Network Operations and Management Symposium, NOMS 2012

ER -