Parsl: Pervasive parallel programming in Python

Yadu Babuji, Anna Woodard, Zhuozhao Li, Daniel S. Katz, Ben Clifford, Rohan Kumar, Lukasz Lacinski, Ryan Chard, Justin M. Wozniak, Ian Foster, Michael Wilde, Kyle Chard

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

High-level programming languages such as Python are increasingly used to provide intuitive interfaces to libraries written in lower-level languages and for assembling applications from various components. This migration towards orchestration rather than implementation, coupled with the growing need for parallel computing (e.g., due to big data and the end of Moore's law), necessitates rethinking how parallelism is expressed in programs. Here, we present Parsl, a parallel scripting library that augments Python with simple, scalable, and flexible constructs for encoding parallelism. These constructs allow Parsl to construct a dynamic dependency graph of components that it can then execute efficiently on one or many processors. Parsl is designed for scalability, with an extensible set of executors tailored to different use cases, such as low-latency, high-throughput, or extreme-scale execution. We show, via experiments on the Blue Waters supercomputer, that Parsl executors can allow Python scripts to execute components with as little as 5 ms of overhead, scale to more than 250 000 workers across more than 8000 nodes, and process upward of 1200 tasks per second. Other Parsl features simplify the construction and execution of composite programs by supporting elastic provisioning and scaling of infrastructure, fault-tolerant execution, and integrated wide-area data management. We show that these capabilities satisfy the needs of many-task, interactive, online, and machine learning applications in fields such as biology, cosmology, and materials science.
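
The dataflow idea described in the abstract, where calling a component returns immediately and the dependency graph is inferred from the values passed between calls, can be illustrated with a minimal sketch. It uses only the Python standard library, not Parsl itself: the `app` decorator, the `square` and `add` functions, and the thread pool are hypothetical stand-ins chosen for illustration, and nothing here should be read as Parsl's API or implementation.

```python
from concurrent.futures import Future, ThreadPoolExecutor
from functools import wraps

# Assumption: a small local thread pool stands in for an executor.
_pool = ThreadPoolExecutor(max_workers=4)


def app(func):
    """Hypothetical decorator: calling the wrapped function returns a Future."""

    @wraps(func)
    def submit(*args):
        def run():
            # Wait for any Future arguments; these are the incoming graph edges.
            resolved = [a.result() if isinstance(a, Future) else a for a in args]
            return func(*resolved)

        # Submission returns at once; the task runs when a worker is free.
        return _pool.submit(run)

    return submit


@app
def square(x):
    return x * x


@app
def add(a, b):
    return a + b


# The two square() calls are independent and may run concurrently.
# add() depends on both because it receives their Futures as arguments.
a = square(3)
b = square(4)
total = add(a, b)

print(total.result())  # 25
```

In this sketch the program never states the graph explicitly. It emerges from which futures are passed to which calls, which is the property the abstract attributes to Parsl's constructs.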

Original language: English (US)
Title of host publication: HPDC 2019- Proceedings of the 28th International Symposium on High-Performance Parallel and Distributed Computing
Publisher: Association for Computing Machinery, Inc
Pages: 25-36
Number of pages: 12
ISBN (Electronic): 9781450366700
DOI: https://doi.org/10.1145/3307681.3325400
State: Published - Jun 17 2019
Event: 28th ACM International Symposium on High-Performance Parallel and Distributed Computing, HPDC 2019 - Phoenix, United States
Duration: Jun 22 2019 to Jun 29 2019

Publication series

Name: HPDC 2019- Proceedings of the 28th International Symposium on High-Performance Parallel and Distributed Computing

Conference

Conference: 28th ACM International Symposium on High-Performance Parallel and Distributed Computing, HPDC 2019
Country: United States
City: Phoenix
Period: 6/22/19 to 6/29/19

Keywords

  • Parallel programming
  • Parsl
  • Python

ASJC Scopus subject areas

  • Software
  • Computational Theory and Mathematics
  • Computer Science Applications

Cite this

Babuji, Y., Woodard, A., Li, Z., Katz, D. S., Clifford, B., Kumar, R., Lacinski, L., Chard, R., Wozniak, J. M., Foster, I., Wilde, M., & Chard, K. (2019). Parsl: Pervasive parallel programming in Python. In HPDC 2019- Proceedings of the 28th International Symposium on High-Performance Parallel and Distributed Computing (pp. 25-36). (HPDC 2019- Proceedings of the 28th International Symposium on High-Performance Parallel and Distributed Computing). Association for Computing Machinery, Inc. https://doi.org/10.1145/3307681.3325400