Parsl: Scalable parallel scripting in Python

Yadu Babuji, Kyle Chard, Ian Foster, Daniel S Katz, Michael Wilde, Anna Woodard, Justin Wozniak

Research output: Contribution to journal › Conference article

Abstract

Computational and data-driven research practices have significantly changed over the past decade to encompass new analysis models such as interactive and online computing. Science gateways are simultaneously evolving to support this transforming landscape with the aim of enabling transparent, scalable execution of a variety of analyses. Science gateways often rely on workflow management systems to represent and execute analyses efficiently and reliably. However, integrating workflow systems in science gateways can be challenging, especially as analyses become more interactive and dynamic, requiring sophisticated orchestration and management of applications and data, and customization for specific execution environments. Parsl (Parallel Scripting Library), a Python library for programming and executing data-oriented workflows in parallel, addresses these problems. Developers simply annotate a Python script with Parsl directives wrapping either Python functions or calls to external applications. Parsl manages the execution of the script on clusters, clouds, grids, and other resources; orchestrates required data movement; and manages the execution of Python functions and external applications in parallel. The Parsl library can be easily integrated into Python-based gateways, allowing for simple management and scaling of workflows.
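
The annotation pattern described in the abstract can be illustrated with a minimal sketch that uses only the Python standard library. It is not Parsl's API or implementation: the decorator `app`, the functions `double` and `total`, and the thread pool `_pool` are hypothetical names chosen for this example, and a local thread pool stands in for the clusters, clouds, and grids that Parsl targets. The sketch shows the general idea only: a decorated function returns a future immediately, and a task that receives futures as inputs waits for them before it runs.

```python
from concurrent.futures import Future, ThreadPoolExecutor
from functools import wraps

# Assumption: a small local thread pool stands in for a real execution backend.
_pool = ThreadPoolExecutor(max_workers=4)


def app(func):
    """Hypothetical decorator: calling the wrapped function returns a Future."""

    @wraps(func)
    def submit(*args, **kwargs):
        def run():
            # Wait for any Future passed as a positional argument, then call
            # the plain function with the resolved values.
            resolved = [a.result() if isinstance(a, Future) else a for a in args]
            return func(*resolved, **kwargs)

        # Hand the task to the pool and return without blocking the caller.
        return _pool.submit(run)

    return submit


@app
def double(x):
    return 2 * x


@app
def total(*values):
    return sum(values)


doubled = [double(i) for i in range(5)]  # five independent tasks
answer = total(*doubled)                 # depends on all five results
print(answer.result())                   # blocks until the whole graph is done
```

Because dependencies are expressed by passing futures between calls, the script reads like ordinary sequential Python while independent tasks are free to run concurrently.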

Original language: English (US)
Journal: CEUR Workshop Proceedings
Volume: 2357
State: Published - Jan 1 2019
Event: 10th International Workshop on Science Gateways, IWSG 2018 - Edinburgh, United Kingdom
Duration: Jun 13 2018 to Jun 15 2018

Keywords

  • Parallel scripting
  • Parsl
  • Python
  • Scientific Workflows

ASJC Scopus subject areas

  • Computer Science(all)

Cite this

Babuji, Y., Chard, K., Foster, I., Katz, D. S., Wilde, M., Woodard, A., & Wozniak, J. (2019). Parsl: Scalable parallel scripting in Python. CEUR Workshop Proceedings, 2357.

@article{da707f895d0d4e87bd616bff30b9cc40,
title = "Parsl: Scalable parallel scripting in Python",
keywords = "Parallel scripting, Parsl, Python, Scientific Workflows",
author = "Yadu Babuji and Kyle Chard and Ian Foster and Katz, {Daniel S} and Michael Wilde and Anna Woodard and Justin Wozniak",
year = "2019",
month = "1",
day = "1",
language = "English (US)",
volume = "2357",
journal = "CEUR Workshop Proceedings",
issn = "1613-0073",
publisher = "CEUR-WS",

}

TY - JOUR

T1 - Parsl

T2 - Scalable parallel scripting in Python

AU - Babuji, Yadu

AU - Chard, Kyle

AU - Foster, Ian

AU - Katz, Daniel S

AU - Wilde, Michael

AU - Woodard, Anna

AU - Wozniak, Justin

PY - 2019/1/1

Y1 - 2019/1/1

KW - Parallel scripting

KW - Parsl

KW - Python

KW - Scientific Workflows

UR - http://www.scopus.com/inward/record.url?scp=85065527336&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85065527336&partnerID=8YFLogxK

M3 - Conference article

AN - SCOPUS:85065527336

VL - 2357

JO - CEUR Workshop Proceedings

JF - CEUR Workshop Proceedings

SN - 1613-0073

ER -