TY - GEN
T1 - An efficient work-stealing scheduler for task dependency graph
AU - Lin, Chun Xun
AU - Huang, Tsung-Wei
AU - Wong, Martin D.F.
N1 - Publisher Copyright:
© 2020 IEEE.
PY - 2020/12
Y1 - 2020/12
N2 - Work-stealing is a key component of many parallel task graph libraries such as Intel Threading Building Blocks (TBB) FlowGraph, Microsoft Task Parallel Library (TPL) Batch.Net, Cpp-Taskflow, and Nabbit. However, designing a correct and effective work-stealing scheduler is a notoriously difficult job, due to subtle implementation details of concurrency controls and decentralized coordination between threads. This problem becomes even more challenging when striving for optimal thread usage in handling parallel workloads with complex task graphs. As a result, we introduce in this paper an effective work-stealing scheduler for execution of task dependency graphs. Our scheduler adopts a simple and efficient strategy to adapt the number of working threads to available task parallelism at any time during the graph execution. Our strategy is provably good in preventing resource underutilization and simultaneously minimizing resource waste when tasks are scarce. We have evaluated our scheduler on both micro-benchmarks and a real-world circuit timing analysis workload, and demonstrated promising results over existing methods in terms of runtime, energy efficiency, and throughput.
AB - Work-stealing is a key component of many parallel task graph libraries such as Intel Threading Building Blocks (TBB) FlowGraph, Microsoft Task Parallel Library (TPL) Batch.Net, Cpp-Taskflow, and Nabbit. However, designing a correct and effective work-stealing scheduler is a notoriously difficult job, due to subtle implementation details of concurrency controls and decentralized coordination between threads. This problem becomes even more challenging when striving for optimal thread usage in handling parallel workloads with complex task graphs. As a result, we introduce in this paper an effective work-stealing scheduler for execution of task dependency graphs. Our scheduler adopts a simple and efficient strategy to adapt the number of working threads to available task parallelism at any time during the graph execution. Our strategy is provably good in preventing resource underutilization and simultaneously minimizing resource waste when tasks are scarce. We have evaluated our scheduler on both micro-benchmarks and a real-world circuit timing analysis workload, and demonstrated promising results over existing methods in terms of runtime, energy efficiency, and throughput.
KW - Multithreading
KW - Parallel computing
KW - Scheduling
KW - Task dependency graph
KW - Work stealing
UR - http://www.scopus.com/inward/record.url?scp=85102350737&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85102350737&partnerID=8YFLogxK
U2 - 10.1109/ICPADS51040.2020.00018
DO - 10.1109/ICPADS51040.2020.00018
M3 - Conference contribution
AN - SCOPUS:85102350737
T3 - Proceedings of the International Conference on Parallel and Distributed Systems - ICPADS
SP - 64
EP - 71
BT - Proceedings - 2020 IEEE 26th International Conference on Parallel and Distributed Systems, ICPADS 2020
PB - IEEE Computer Society
T2 - 26th IEEE International Conference on Parallel and Distributed Systems, ICPADS 2020
Y2 - 2 December 2020 through 4 December 2020
ER -