TY - GEN
T1 - ARIA: Automatic resource inference and allocation for MapReduce environments
T2 - 8th ACM International Conference on Autonomic Computing, ICAC 2011 and Co-located Workshops
AU - Verma, Abhishek
AU - Cherkasova, Ludmila
AU - Campbell, Roy H.
PY - 2011
Y1 - 2011
N2 - MapReduce and Hadoop represent an economically compelling alternative for efficient large-scale data processing and advanced analytics in the enterprise. A key challenge in shared MapReduce clusters is the ability to automatically tailor and control resource allocations to different applications so that they achieve their performance goals. Currently, there is no job scheduler for MapReduce environments that, given a job completion deadline, could allocate the appropriate amount of resources to the job so that it meets the required Service Level Objective (SLO). In this work, we propose a framework, called ARIA, to address this problem. It comprises three interrelated components. First, for a production job that is routinely executed on a new dataset, we build a job profile that compactly summarizes critical performance characteristics of the underlying application during the map and reduce stages. Second, we design a MapReduce performance model that, for a given job (with a known profile) and its SLO (soft deadline), estimates the amount of resources required for job completion within the deadline. Finally, we implement a novel SLO-based scheduler in Hadoop that determines job ordering and the amount of resources to allocate for meeting the job deadlines. We validate our approach using a set of realistic applications. The new scheduler effectively meets the jobs' SLOs until the job demands exceed the cluster resources. The results of the extensive simulation study are validated through detailed experiments on a 66-node Hadoop cluster.
KW - MapReduce
KW - modeling
KW - resource allocation
KW - scheduling
UR - http://www.scopus.com/inward/record.url?scp=79960196705&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=79960196705&partnerID=8YFLogxK
U2 - 10.1145/1998582.1998637
DO - 10.1145/1998582.1998637
M3 - Conference contribution
AN - SCOPUS:79960196705
SN - 9781450306072
T3 - Proceedings of the 8th ACM International Conference on Autonomic Computing, ICAC 2011 and Co-located Workshops
SP - 235
EP - 244
BT - Proceedings of the 8th ACM International Conference on Autonomic Computing, ICAC 2011 and Co-located Workshops
Y2 - 14 June 2011 through 18 June 2011
ER -