Understanding collaborative studies through interoperable workflow provenance

Ilkay Altintas, Manish Kumar Anand, Daniel Crawl, Shawn Bowers, Adam Belloum, Paolo Missier, Bertram Ludäscher, Carole A. Goble, Peter M.A. Sloot

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

The provenance of a data product contains information about how the product was derived, and is crucial for enabling scientists to easily understand, reproduce, and verify scientific results. Currently, most provenance models are designed to capture the provenance of a single run, typically executed by a single user. However, a scientific discovery is often the result of the methodical execution of many scientific workflows with many datasets produced at different times by one or more users. Further, to promote and facilitate exchange of information between multiple workflow systems supporting provenance, the Open Provenance Model (OPM) has been proposed by the scientific workflow community. In this paper, we describe a new query model that captures implicit user collaborations. We show how this model maps to OPM and helps to answer collaborative queries, e.g., identifying combined workflows and contributions of users collaborating on a project based on the records of previous workflow executions. We also adopt and extend the high-level Query Language for Provenance (QLP) with additional constructs, and show how these extensions allow non-expert users to express collaborative provenance queries against this model easily and concisely. Furthermore, we adopt the Provenance Challenge 3 (PC3) workflows as a collaborative and interoperable use case scenario, where different stages of the workflow are executed in three different workflow environments: Kepler, Taverna, and WS-VLAM. Through this use case, we demonstrate how we can establish and understand collaborative studies through interoperable workflow provenance.

Original language: English (US)
Title of host publication: Provenance and Annotation of Data and Processes - Third International Provenance and Annotation Workshop, IPAW 2010, Revised Selected Papers
Pages: 42-58
Number of pages: 17
State: Published - 2010
Externally published: Yes
Event: 3rd International Provenance and Annotation Workshop, IPAW 2010 - Troy, NY, United States
Duration: Jun 15 2010 to Jun 16 2010

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 6378 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 3rd International Provenance and Annotation Workshop, IPAW 2010
Country/Territory: United States
City: Troy, NY
Period: 6/15/10 to 6/16/10

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Computer Science (all)
