TY - JOUR
T1 - DtCraft
T2 - A High-Performance Distributed Execution Engine at Scale
AU - Huang, Tsung-Wei
AU - Lin, Chun-Xun
AU - Wong, Martin D.F.
N1 - Funding Information:
Manuscript received November 26, 2017; revised February 27, 2018; accepted May 4, 2018. Date of publication May 8, 2018; date of current version May 20, 2019. This work was supported by the National Science Foundation under Grant CCF-1421563 and Grant CCF-171883. A preliminary version of this paper was presented at the 2017 IEEE/ACM International Conference on Computer-Aided Design (ICCAD’17), Irvine, CA, USA, November 2017 [1]. This paper was recommended by Associate Editor A. Srivastava. (Corresponding author: Tsung-Wei Huang.) The authors are with the Department of Electrical and Computer Engineering, University of Illinois at Urbana–Champaign, Champaign, IL 61801 USA (e-mail: twh760812@gmail.com; clin99@illinois.edu; mdfwong@illinois.edu).
Publisher Copyright:
© 1982-2012 IEEE.
PY - 2019/6
Y1 - 2019/6
N2 - Recent years have seen rapid growth in data-driven distributed systems, such as Hadoop MapReduce, Spark, and Dryad. However, their counterparts for high-performance or compute-intensive applications, including large-scale optimization, modeling, and simulation, are still nascent. In this paper, we introduce DtCraft, a modern C++-based distributed execution engine that streamlines the development of high-performance parallel applications. Users need no understanding of distributed computing and can focus on high-level development, leaving difficult details, such as concurrency control, workload distribution, and fault tolerance, to be handled transparently by our system. We have evaluated DtCraft on both micro-benchmarks and large-scale optimization problems, and have shown promising performance from single multicore machines to clusters of computers. In a particular semiconductor design problem, we achieved a 30× speedup with 40 nodes and 15× less development effort than a hand-crafted implementation.
KW - Distributed computing
KW - Parallel programming
UR - http://www.scopus.com/inward/record.url?scp=85046740660&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85046740660&partnerID=8YFLogxK
U2 - 10.1109/TCAD.2018.2834422
DO - 10.1109/TCAD.2018.2834422
M3 - Article
AN - SCOPUS:85046740660
SN - 0278-0070
VL - 38
SP - 1070
EP - 1083
JO - IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
JF - IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
IS - 6
M1 - 8355904
ER -