TY - GEN
T1 - Scheduling with multi-level data locality
T2 - 35th Annual IEEE International Conference on Computer Communications, IEEE INFOCOM 2016
AU - Xie, Qiaomin
AU - Yekkehkhany, Ali
AU - Lu, Yi
N1 - Publisher Copyright:
© 2016 IEEE.
Copyright:
Copyright 2017 Elsevier B.V., All rights reserved.
PY - 2016/7/27
Y1 - 2016/7/27
N2 - A fundamental problem to all data-parallel applications is data locality. An example is map task scheduling in the MapReduce framework. Existing theoretical work analyzes systems with only two levels of locality, despite the existence of multiple locality levels within and across data centers. We found that going from two to three levels of locality changes the problem drastically, as a tradeoff between performance and throughput emerges. The recently proposed priority algorithm, which is throughput and heavy-traffic optimal for two locality levels, is not even throughput-optimal with three locality levels. The JSQ-MaxWeight algorithm proposed by Wang et al. is heavy-traffic optimal only for a special traffic scenario with two locality levels. We show that an extension of the JSQ-MaxWeight algorithm to three locality levels preserves its throughput-optimality, but suffers from the same lack of heavy-traffic optimality for most traffic scenarios. We propose a novel algorithm that uses Weighted-Workload (WW) routing and priority service. We establish its throughput and heavy-traffic optimality for all traffic scenarios. The main challenge is the construction of an appropriate ideal load decomposition that allows the separate treatment of different subsystems.
AB - A fundamental problem to all data-parallel applications is data locality. An example is map task scheduling in the MapReduce framework. Existing theoretical work analyzes systems with only two levels of locality, despite the existence of multiple locality levels within and across data centers. We found that going from two to three levels of locality changes the problem drastically, as a tradeoff between performance and throughput emerges. The recently proposed priority algorithm, which is throughput and heavy-traffic optimal for two locality levels, is not even throughput-optimal with three locality levels. The JSQ-MaxWeight algorithm proposed by Wang et al. is heavy-traffic optimal only for a special traffic scenario with two locality levels. We show that an extension of the JSQ-MaxWeight algorithm to three locality levels preserves its throughput-optimality, but suffers from the same lack of heavy-traffic optimality for most traffic scenarios. We propose a novel algorithm that uses Weighted-Workload (WW) routing and priority service. We establish its throughput and heavy-traffic optimality for all traffic scenarios. The main challenge is the construction of an appropriate ideal load decomposition that allows the separate treatment of different subsystems.
UR - http://www.scopus.com/inward/record.url?scp=84983353674&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84983353674&partnerID=8YFLogxK
U2 - 10.1109/INFOCOM.2016.7524416
DO - 10.1109/INFOCOM.2016.7524416
M3 - Conference contribution
AN - SCOPUS:84983353674
T3 - Proceedings - IEEE INFOCOM
BT - IEEE INFOCOM 2016 - 35th Annual IEEE International Conference on Computer Communications
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 10 April 2016 through 14 April 2016
ER -