Provenance in collection-oriented scientific workflows

Shawn Bowers, Timothy M. McPhillips, Bertram Ludäscher

Research output: Contribution to journal › Article › peer-review

Abstract

We describe a provenance model tailored to scientific workflows based on the collection-oriented modeling and design paradigm. Our implementation within the Kepler scientific workflow system captures the dependencies of data and collection creation events on preexisting data and collections, and embeds these provenance records within the data stream. A provenance query engine operates on self-contained workflow traces representing serializations of the output data stream for particular workflow runs. We demonstrate this approach in our response to the first provenance challenge.
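
The idea of embedding provenance records in the data stream and querying a serialized trace can be illustrated with a minimal sketch. This is not the authors' Kepler implementation: the `Item` and `Dependency` token types, the `lineage` function, and the actor and item names in the sample trace are all hypothetical, chosen only to show how a self-contained trace could answer a dependency query.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    """A data or collection token in the stream (hypothetical structure)."""
    id: str


@dataclass(frozen=True)
class Dependency:
    """Inline provenance record (hypothetical): `target` was created by
    `actor` from the preexisting items listed in `sources`."""
    target: str
    actor: str
    sources: tuple


def lineage(trace, item_id):
    """Return the ids of all items that `item_id` transitively depends on."""
    # Index the provenance records found in the trace by their target item.
    deps = {}
    for token in trace:
        if isinstance(token, Dependency):
            deps.setdefault(token.target, set()).update(token.sources)

    # Walk the dependency graph backwards from the requested item.
    seen = set()
    stack = [item_id]
    while stack:
        current = stack.pop()
        for source in deps.get(current, ()):
            if source not in seen:
                seen.add(source)
                stack.append(source)
    return seen


# A toy trace: data tokens interleaved with the records describing them.
trace = [
    Item("img1"),
    Item("img2"),
    Item("aligned1"), Dependency("aligned1", "align", ("img1",)),
    Item("aligned2"), Dependency("aligned2", "align", ("img2",)),
    Item("atlas"), Dependency("atlas", "average", ("aligned1", "aligned2")),
]

print(sorted(lineage(trace, "atlas")))
```

Because the dependency records travel with the data, the trace alone is enough to answer the query; no external provenance store is consulted in this sketch.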

Original language: English (US)
Pages (from-to): 519-529
Number of pages: 11
Journal: Concurrency and Computation: Practice and Experience
Volume: 20
Issue number: 5
State: Published - Apr 10 2008
Externally published: Yes

Keywords

  • Collection-oriented scientific workflows
  • Provenance
  • Scientific data management

ASJC Scopus subject areas

  • Software
  • Theoretical Computer Science
  • Computer Science Applications
  • Computer Networks and Communications
  • Computational Theory and Mathematics
