DtCraft: A High-Performance Distributed Execution Engine at Scale

Research output: Contribution to journal › Article

Abstract

Recent years have seen rapid growth in data-driven distributed systems, such as Hadoop MapReduce, Spark, and Dryad. However, the counterparts for high-performance or compute-intensive applications including large-scale optimizations, modeling, and simulations are still nascent. In this paper, we introduce DtCraft, a modern C++ based distributed execution engine to streamline the development of high-performance parallel applications. Users need no understanding of distributed computing and can focus on high-level developments, leaving difficult details, such as concurrency controls, workload distribution, and fault tolerance handled by our system transparently. We have evaluated DtCraft on both micro-benchmarks and large-scale optimization problems, and shown the promising performance from single multicore machines to clusters of computers. In a particular semiconductor design problem, we achieved 30 × speedup with 40 nodes and 15 × less development efforts over hand-crafted implementation.

Original language: English (US)
Article number: 8355904
Pages (from-to): 1070-1083
Number of pages: 14
Journal: IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
Volume: 38
Issue number: 6
DOI: 10.1109/TCAD.2018.2834422
State: Published - Jun 2019

Fingerprint

Engines
Concurrency control
Distributed computer systems
Fault tolerance
Electric sparks
Semiconductor materials

Keywords

  • Distributed computing
  • parallel programming

ASJC Scopus subject areas

  • Software
  • Computer Graphics and Computer-Aided Design
  • Electrical and Electronic Engineering

Cite this

DtCraft: A High-Performance Distributed Execution Engine at Scale. / Huang, Tsung-Wei; Lin, Chun Xun; Wong, Martin D F.

In: IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, Vol. 38, No. 6, 8355904, 06.2019, p. 1070-1083.


@article{3db7007acc004dc7bfd136e6ff2aaa87,
title = "DtCraft: A High-Performance Distributed Execution Engine at Scale",
abstract = "Recent years have seen rapid growth in data-driven distributed systems, such as Hadoop MapReduce, Spark, and Dryad. However, the counterparts for high-performance or compute-intensive applications including large-scale optimizations, modeling, and simulations are still nascent. In this paper, we introduce DtCraft, a modern C++ based distributed execution engine to streamline the development of high-performance parallel applications. Users need no understanding of distributed computing and can focus on high-level developments, leaving difficult details, such as concurrency controls, workload distribution, and fault tolerance handled by our system transparently. We have evaluated DtCraft on both micro-benchmarks and large-scale optimization problems, and shown the promising performance from single multicore machines to clusters of computers. In a particular semiconductor design problem, we achieved 30 × speedup with 40 nodes and 15 × less development efforts over hand-crafted implementation.",
keywords = "Distributed computing, parallel programming",
author = "Huang, Tsung-Wei and Lin, {Chun Xun} and Wong, {Martin D F}",
year = "2019",
month = "6",
doi = "10.1109/TCAD.2018.2834422",
language = "English (US)",
volume = "38",
pages = "1070--1083",
journal = "IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems",
issn = "0278-0070",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
number = "6",

}

TY - JOUR
T1 - DtCraft
T2 - A High-Performance Distributed Execution Engine at Scale
AU - Huang, Tsung-Wei
AU - Lin, Chun Xun
AU - Wong, Martin D F
PY - 2019/6
Y1 - 2019/6
N2 - Recent years have seen rapid growth in data-driven distributed systems, such as Hadoop MapReduce, Spark, and Dryad. However, the counterparts for high-performance or compute-intensive applications including large-scale optimizations, modeling, and simulations are still nascent. In this paper, we introduce DtCraft, a modern C++ based distributed execution engine to streamline the development of high-performance parallel applications. Users need no understanding of distributed computing and can focus on high-level developments, leaving difficult details, such as concurrency controls, workload distribution, and fault tolerance handled by our system transparently. We have evaluated DtCraft on both micro-benchmarks and large-scale optimization problems, and shown the promising performance from single multicore machines to clusters of computers. In a particular semiconductor design problem, we achieved 30 × speedup with 40 nodes and 15 × less development efforts over hand-crafted implementation.

AB - Recent years have seen rapid growth in data-driven distributed systems, such as Hadoop MapReduce, Spark, and Dryad. However, the counterparts for high-performance or compute-intensive applications including large-scale optimizations, modeling, and simulations are still nascent. In this paper, we introduce DtCraft, a modern C++ based distributed execution engine to streamline the development of high-performance parallel applications. Users need no understanding of distributed computing and can focus on high-level developments, leaving difficult details, such as concurrency controls, workload distribution, and fault tolerance handled by our system transparently. We have evaluated DtCraft on both micro-benchmarks and large-scale optimization problems, and shown the promising performance from single multicore machines to clusters of computers. In a particular semiconductor design problem, we achieved 30 × speedup with 40 nodes and 15 × less development efforts over hand-crafted implementation.

KW - Distributed computing
KW - parallel programming
UR - http://www.scopus.com/inward/record.url?scp=85046740660&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85046740660&partnerID=8YFLogxK
U2 - 10.1109/TCAD.2018.2834422
DO - 10.1109/TCAD.2018.2834422
M3 - Article
AN - SCOPUS:85046740660
VL - 38
SP - 1070
EP - 1083
JO - IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
JF - IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
SN - 0278-0070
IS - 6
M1 - 8355904
ER -