TY - GEN
T1 - Priority algorithm for near-data scheduling
T2 - 34th IEEE Annual Conference on Computer Communications and Networks, IEEE INFOCOM 2015
AU - Xie, Qiaomin
AU - Lu, Yi
N1 - Publisher Copyright:
© 2015 IEEE.
PY - 2015/8/21
Y1 - 2015/8/21
N2 - The prevalence of data-parallel applications has made near-data scheduling an important problem. An example is the map task scheduling in the map-reduce framework. Wang et. al. [13] was the first to identify its capacity region and proposed a throughput-optimal algorithm based on MaxWeight. However, the study of the algorithm's delay performance revealed that it is only heavy-traffic optimal for a very special traffic scenario, where all traffic concentrates on a subset of servers. We propose a simple 'local-tasks first' priority algorithm and show that it is throughput-optimal and heavy-traffic optimal for all traffic scenarios, i.e., it asymptotically minimizes the average delay as the arrival rate vector approaches the boundary of the capacity region. So far, it is the only known heavy-traffic optimal algorithm for this setting. As the algorithm is based on pre-determined priority, a direct application of the Lyapunov drift technique does not work. The main proof ideas are the construction of an ideal load decomposition and the separate treatment of two subsystems based on their ideal load. To the best of our knowledge, this is the only setup of affinity scheduling where a simple priority algorithm is shown to be heavy-traffic optimal. Simulation shows that our algorithm also significantly outperforms existing algorithms at loads away from the boundary of the capacity region.
AB - The prevalence of data-parallel applications has made near-data scheduling an important problem. An example is the map task scheduling in the map-reduce framework. Wang et. al. [13] was the first to identify its capacity region and proposed a throughput-optimal algorithm based on MaxWeight. However, the study of the algorithm's delay performance revealed that it is only heavy-traffic optimal for a very special traffic scenario, where all traffic concentrates on a subset of servers. We propose a simple 'local-tasks first' priority algorithm and show that it is throughput-optimal and heavy-traffic optimal for all traffic scenarios, i.e., it asymptotically minimizes the average delay as the arrival rate vector approaches the boundary of the capacity region. So far, it is the only known heavy-traffic optimal algorithm for this setting. As the algorithm is based on pre-determined priority, a direct application of the Lyapunov drift technique does not work. The main proof ideas are the construction of an ideal load decomposition and the separate treatment of two subsystems based on their ideal load. To the best of our knowledge, this is the only setup of affinity scheduling where a simple priority algorithm is shown to be heavy-traffic optimal. Simulation shows that our algorithm also significantly outperforms existing algorithms at loads away from the boundary of the capacity region.
UR - http://www.scopus.com/inward/record.url?scp=84954231224&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84954231224&partnerID=8YFLogxK
U2 - 10.1109/INFOCOM.2015.7218468
DO - 10.1109/INFOCOM.2015.7218468
M3 - Conference contribution
AN - SCOPUS:84954231224
T3 - Proceedings - IEEE INFOCOM
SP - 963
EP - 972
BT - 2015 IEEE Conference on Computer Communications, IEEE INFOCOM 2015
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 26 April 2015 through 1 May 2015
ER -