Enabling Scientific Workflow Reuse through Structured Composition of Dataflow and Control-Flow

Shawn Bowers, Bertram Ludäscher, Anne H.H. Ngu, Terence Critchlow

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

Data-centric scientific workflows are often modeled as dataflow process networks. The simplicity of the dataflow framework facilitates workflow design, analysis, and optimization. However, modeling "control-flow intensive" tasks using dataflow constructs often leads to overly complicated workflows that are hard to comprehend, reuse, and maintain. We describe a generic framework, based on scientific workflow templates and frames, for embedding control-flow intensive subtasks within dataflow process networks. This approach can seamlessly handle complex control-flow without sacrificing the benefits of dataflow. We illustrate our approach with a real-world scientific workflow from the astrophysics domain, requiring remote execution and file transfer in a semi-reliable environment. For such workflows, we also describe a three-layer architecture based on frames and templates, where the top layer consists of an overall dataflow process network, the second layer consists of a transducer template for modeling the desired control-flow behavior, and the bottom layer consists of frames inside the template that are specialized by embedding the desired component implementation. Our approach can enable scientific workflows that are more robust (fault-tolerance strategies can be defined by control-flow driven transducer templates) and at the same time more reusable, since the embedding of frames and templates yields more structured and modular workflow designs.

Original language: English (US)
Title of host publication: ICDEW 2006 - Proceedings of the 22nd International Conference on Data Engineering Workshops
Editors: Roger S. Barga, Xiaofang Zhou
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 70-79
Number of pages: 10
ISBN (Electronic): 0769525717, 9780769525716
State: Published - 2006
Externally published: Yes
Event: 22nd International Conference on Data Engineering Workshops, ICDEW 2006 - Atlanta, United States
Duration: Apr 3 2006 to Apr 7 2006

Publication series

Name: ICDEW 2006 - Proceedings of the 22nd International Conference on Data Engineering Workshops

Other

Other: 22nd International Conference on Data Engineering Workshops, ICDEW 2006
Country/Territory: United States
City: Atlanta
Period: 4/3/06 to 4/7/06

ASJC Scopus subject areas

  • Information Systems
  • Computer Networks and Communications
  • Information Systems and Management
