Breaking the MapReduce stage barrier

Abhishek Verma, Nicolas Zea, Brian Cho, Indranil Gupta, Roy H. Campbell

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

The MapReduce model uses a barrier between the Map and Reduce stages. This provides simplicity in both programming and implementation. However, in many situations, this barrier hurts performance because it is overly restrictive. Hence, we develop a method to break the barrier in MapReduce in a way that improves efficiency. Careful design of our barrier-less MapReduce framework results in equivalent generality and retains ease of programming. We motivate our case with, and experimentally study our barrier-less techniques in, a wide variety of MapReduce applications divided into seven classes. Our experiments show that our approach can achieve better performance times than a traditional MapReduce framework. We achieve a reduction in job completion times that is 25% on average and 87% in the best case.
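The difference between a barriered and a barrier-less reduce can be illustrated with a minimal, single-process sketch. It is not the authors' framework: the function names (`map_words`, `word_count_with_barrier`, `word_count_barrierless`) are hypothetical, and word count is chosen only because its reduce function (a sum) is associative and commutative, so partial results can be folded in as soon as each map output arrives. The abstract does not say how the seven application classes are handled, so nothing here should be read as describing them.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, Tuple


def map_words(line: str) -> Iterator[Tuple[str, int]]:
    """Map step (hypothetical): emit (word, 1) for every word in a line."""
    for word in line.split():
        yield word.lower(), 1


def word_count_with_barrier(lines: Iterable[str]) -> Dict[str, int]:
    """Classic flow: buffer ALL map output, then reduce."""
    grouped = defaultdict(list)
    for line in lines:
        for key, value in map_words(line):
            grouped[key].append(value)
    # Barrier: no reduce call happens until every map has finished.
    return {key: sum(values) for key, values in grouped.items()}


def word_count_barrierless(lines: Iterable[str]) -> Dict[str, int]:
    """Illustrative barrier-less flow: reduce incrementally per record."""
    partial = defaultdict(int)
    for line in lines:
        for key, value in map_words(line):
            # Fold each intermediate pair into a running partial result
            # immediately, instead of waiting for the map stage to end.
            partial[key] += value
    return dict(partial)


if __name__ == "__main__":
    sample = ["the map stage", "the reduce stage"]
    print(word_count_with_barrier(sample))
    print(word_count_barrierless(sample))
```

Both functions return the same counts; the barrier-less variant simply never holds the full list of intermediate values per key. In a distributed setting the saving would come from overlapping reduce work with still-running maps, which this single-process sketch does not model.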

Original language: English (US)
Title of host publication: Proceedings - 2010 IEEE International Conference on Cluster Computing, Cluster 2010
Pages: 235-244
Number of pages: 10
DOI: https://doi.org/10.1109/CLUSTER.2010.29
State: Published - Dec 2 2010

Publication series

Name: Proceedings - IEEE International Conference on Cluster Computing, ICCC
ISSN (Print): 1552-5244

ASJC Scopus subject areas

  • Software
  • Hardware and Architecture
  • Signal Processing

Cite this

Verma, A., Zea, N., Cho, B., Gupta, I., & Campbell, R. H. (2010). Breaking the MapReduce stage barrier. In Proceedings - 2010 IEEE International Conference on Cluster Computing, Cluster 2010 (pp. 235-244). [5600302] (Proceedings - IEEE International Conference on Cluster Computing, ICCC). https://doi.org/10.1109/CLUSTER.2010.29

@inproceedings{33a8a9cc83ad43a0a9ee0af6086068d2,
title = "Breaking the MapReduce stage barrier",
abstract = "The MapReduce model uses a barrier between the Map and Reduce stages. This provides simplicity in both programming and implementation. However, in many situations, this barrier hurts performance because it is overly restrictive. Hence, we develop a method to break the barrier in MapReduce in a way that improves efficiency. Careful design of our barrier-less MapReduce framework results in equivalent generality and retains ease of programming. We motivate our case with, and experimentally study our barrier-less techniques in, a wide variety of MapReduce applications divided into seven classes. Our experiments show that our approach can achieve better performance times than a traditional MapReduce framework. We achieve a reduction in job completion times that is 25{\%} on average and 87{\%} in the best case.",
author = "Abhishek Verma and Nicolas Zea and Brian Cho and Indranil Gupta and Campbell, {Roy H.}",
year = "2010",
month = "12",
day = "2",
doi = "10.1109/CLUSTER.2010.29",
language = "English (US)",
isbn = "9780769542201",
series = "Proceedings - IEEE International Conference on Cluster Computing, ICCC",
pages = "235--244",
booktitle = "Proceedings - 2010 IEEE International Conference on Cluster Computing, Cluster 2010",

}

TY - GEN

T1 - Breaking the MapReduce stage barrier

AU - Verma, Abhishek

AU - Zea, Nicolas

AU - Cho, Brian

AU - Gupta, Indranil

AU - Campbell, Roy H.

PY - 2010/12/2

Y1 - 2010/12/2

N2 - The MapReduce model uses a barrier between the Map and Reduce stages. This provides simplicity in both programming and implementation. However, in many situations, this barrier hurts performance because it is overly restrictive. Hence, we develop a method to break the barrier in MapReduce in a way that improves efficiency. Careful design of our barrier-less MapReduce framework results in equivalent generality and retains ease of programming. We motivate our case with, and experimentally study our barrier-less techniques in, a wide variety of MapReduce applications divided into seven classes. Our experiments show that our approach can achieve better performance times than a traditional MapReduce framework. We achieve a reduction in job completion times that is 25% on average and 87% in the best case.

UR - http://www.scopus.com/inward/record.url?scp=78649457559&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=78649457559&partnerID=8YFLogxK

U2 - 10.1109/CLUSTER.2010.29

DO - 10.1109/CLUSTER.2010.29

M3 - Conference contribution

AN - SCOPUS:78649457559

SN - 9780769542201

T3 - Proceedings - IEEE International Conference on Cluster Computing, ICCC

SP - 235

EP - 244

BT - Proceedings - 2010 IEEE International Conference on Cluster Computing, Cluster 2010

ER -