TY - GEN
T1 - Stark
T2 - 37th IEEE International Conference on Distributed Computing Systems, ICDCS 2017
AU - Li, Shen
AU - Amin, Md Tanvir
AU - Ganti, Raghu
AU - Srivatsa, Mudhakar
AU - Hu, Shaohan
AU - Zhao, Yiran
AU - Abdelzaher, Tarek
N1 - Research was sponsored by the Army Research Laboratory and was accomplished under Cooperative Agreement Number W911NF-09-2-0053 (the ARL Network Science CTA) and NSF CNS 13-20209.
PY - 2017/7/13
Y1 - 2017/7/13
N2 - Emerging distributed in-memory computing frameworks, such as Apache Spark, can process a huge amount of cached data within seconds. This remarkably high efficiency requires the system to balance data well across tasks and ensure data locality. However, it is challenging to satisfy these requirements for applications that operate on a collection of dynamically loaded and evicted datasets. The dynamics may lead to time-varying data volume and distribution, which would frequently invoke expensive data re-partition and transfer operations, resulting in high overhead and large delay. To address this problem, we present Stark, a system specifically designed for optimizing in-memory computing on dynamic dataset collections. Stark enforces data locality for transformations spanning multiple datasets (e.g., join and cogroup) to avoid unnecessary data replications and shuffles. Moreover, to accommodate fluctuating data volume and skewed data distribution, Stark delivers elasticity into partitions to balance task execution time and reduce job makespan. Finally, Stark achieves bounded failure recovery latency by optimizing the data checkpointing strategy. Evaluations on a 50-server cluster show that Stark reduces the job makespan by 4X and improves system throughput by 6X compared to Spark.
AB - Emerging distributed in-memory computing frameworks, such as Apache Spark, can process a huge amount of cached data within seconds. This remarkably high efficiency requires the system to balance data well across tasks and ensure data locality. However, it is challenging to satisfy these requirements for applications that operate on a collection of dynamically loaded and evicted datasets. The dynamics may lead to time-varying data volume and distribution, which would frequently invoke expensive data re-partition and transfer operations, resulting in high overhead and large delay. To address this problem, we present Stark, a system specifically designed for optimizing in-memory computing on dynamic dataset collections. Stark enforces data locality for transformations spanning multiple datasets (e.g., join and cogroup) to avoid unnecessary data replications and shuffles. Moreover, to accommodate fluctuating data volume and skewed data distribution, Stark delivers elasticity into partitions to balance task execution time and reduce job makespan. Finally, Stark achieves bounded failure recovery latency by optimizing the data checkpointing strategy. Evaluations on a 50-server cluster show that Stark reduces the job makespan by 4X and improves system throughput by 6X compared to Spark.
UR - http://www.scopus.com/inward/record.url?scp=85027278926&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85027278926&partnerID=8YFLogxK
U2 - 10.1109/ICDCS.2017.143
DO - 10.1109/ICDCS.2017.143
M3 - Conference contribution
AN - SCOPUS:85027278926
T3 - Proceedings - International Conference on Distributed Computing Systems
SP - 103
EP - 114
BT - Proceedings - IEEE 37th International Conference on Distributed Computing Systems, ICDCS 2017
A2 - Lee, Kisung
A2 - Liu, Ling
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 5 June 2017 through 8 June 2017
ER -