TY - GEN
T1 - Supporting on-demand elasticity in distributed graph processing
AU - Pundir, Mayank
AU - Kumar, Manoj
AU - Leslie, Luke M.
AU - Gupta, Indranil
AU - Campbell, Roy H.
N1 - Publisher Copyright: © 2016 IEEE.
PY - 2016/6/1
Y1 - 2016/6/1
AB - While distributed graph processing engines have become popular for processing large graphs, these engines are typically configured with a static set of servers in the cluster. In other words, they lack the flexibility to scale-out or scale-in the number of servers, when requested to do so by the user. In this paper, we propose the first techniques to make distributed graph processing truly elastic. While supporting on-demand scale-out/in operations, we meet three goals: i) perform scale-out/in without interrupting the graph computation, ii) minimize the background network overhead involved in the scale-out/in, and iii) mitigate stragglers by maintaining load balance across servers. We present and analyze two techniques called Contiguous Vertex Repartitioning (CVR) and Ring-based Vertex Repartitioning (RVR) to address these goals. We implement our techniques in the LFGraph distributed graph processing system, and incorporate several systems optimizations. Experiments performed with multiple graph benchmark applications on a real graph indicate that our techniques perform within 9% and 21% of the optimum for scale-out and scale-in operations, respectively.
UR - http://www.scopus.com/inward/record.url?scp=84978062212&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84978062212&partnerID=8YFLogxK
U2 - 10.1109/IC2E.2016.31
DO - 10.1109/IC2E.2016.31
M3 - Conference contribution
AN - SCOPUS:84978062212
T3 - Proceedings - 2016 IEEE International Conference on Cloud Engineering, IC2E 2016: Co-located with the 1st IEEE International Conference on Internet-of-Things Design and Implementation, IoTDI 2016
SP - 12
EP - 21
BT - Proceedings - 2016 IEEE International Conference on Cloud Engineering, IC2E 2016
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 4th IEEE Annual International Conference on Cloud Engineering, IC2E 2016
Y2 - 4 April 2016 through 8 April 2016
ER -