Computing location-based lineage from workflow specifications to optimize provenance queries

Saumen Dey, Sven Köhler, Shawn Bowers, Bertram Ludäscher

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

We present a location-based approach for executing provenance lineage queries that significantly reduces query execution cost without incurring additional storage costs. The key idea of our approach is to exploit two facts: provenance graphs resemble the workflow graphs that generated them, and many workflow computation models assume that workflow steps have statically defined data consumption-production (i.e., data input-output) rates. We describe a new lineage computation technique that uses the structure of workflow specifications together with consumption-production rates to pre-compute (i.e., to forecast) the access paths of all dependent data items prior to workflow execution. We also present experimental results showing that our approach can significantly outperform traditional data lineage query techniques.
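The idea of forecasting lineage from static rates can be illustrated with a minimal sketch. It is not the paper's algorithm: the `Step` class, the `lineage` function, the restriction to a linear pipeline, and the rule that every output of a firing depends on all inputs of that firing are all assumptions made here for illustration. Under those assumptions, the input positions that an output item depends on follow from the specification alone, with no recorded trace to traverse.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    """Hypothetical workflow step with fixed per-firing rates (assumed static)."""
    name: str
    consume: int  # items read from the input channel per firing
    produce: int  # items written to the output channel per firing


def input_positions(step: Step, out_pos: int) -> range:
    """Positions on the step's input channel that output position out_pos depends on.

    Assumes firing k reads inputs [k*consume, (k+1)*consume) and writes
    outputs [k*produce, (k+1)*produce), all 0-indexed.
    """
    firing = out_pos // step.produce
    start = firing * step.consume
    return range(start, start + step.consume)


def lineage(pipeline: list[Step], out_pos: int) -> dict[str, list[int]]:
    """Walk a linear pipeline backwards from one item of the final output.

    Returns, for each step, the positions on that step's input channel
    that the chosen output item depends on.
    """
    result: dict[str, list[int]] = {}
    positions = {out_pos}
    for step in reversed(pipeline):
        upstream: set[int] = set()
        for pos in positions:
            upstream.update(input_positions(step, pos))
        result[step.name] = sorted(upstream)
        positions = upstream
    return result


# Step A turns 1 input item into 2 items; step B turns 3 items into 1.
pipeline = [Step("A", consume=1, produce=2), Step("B", consume=3, produce=1)]
print(lineage(pipeline, out_pos=1))
```

In this example, output item 1 of step B comes from B's second firing, which reads positions 3 to 5 of the channel between A and B; those items were written by A's second and third firings, which read positions 1 and 2 of the workflow input. The sketch derives these locations arithmetically from the rates, which is the sense in which access paths can be computed before the workflow runs.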

Original language: English (US)
Title of host publication: Provenance and Annotation of Data and Processes - 5th International Provenance and Annotation Workshop, IPAW 2014, Revised Selected Papers
Editors: Beth Plale, Bertram Ludäscher
Publisher: Springer
Pages: 180-193
Number of pages: 14
ISBN (Electronic): 9783319164618
State: Published - 2015
Externally published: Yes
Event: 5th International Provenance and Annotation Workshop, IPAW 2014 - Cologne, Germany
Duration: Jun 10 2014 → Jun 11 2014

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 8628
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 5th International Provenance and Annotation Workshop, IPAW 2014
Country/Territory: Germany
City: Cologne
Period: 6/10/14 → 6/11/14

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Computer Science(all)
