TY - GEN
T1 - Computing location-based lineage from workflow specifications to optimize provenance queries
AU - Dey, Saumen
AU - Köhler, Sven
AU - Bowers, Shawn
AU - Ludäscher, Bertram
N1 - Funding Information:
Supported in part by NSF ACI-0830944 and IIS-1118088.
Publisher Copyright:
© Springer International Publishing Switzerland 2015.
PY - 2015
Y1 - 2015
N2 - We present a location-based approach for executing provenance lineage queries that significantly reduces query execution cost without incurring additional storage costs. The key idea of our approach is to exploit the fact that provenance graphs resemble the workflow graphs that generated them and that many workflow computation models assume workflow steps have statically defined data consumption-production (i.e., data input-output) rates. We describe a new lineage computation technique that uses the structure of workflow specifications together with consumption-production rates to pre-compute (i.e., to forecast) the access paths of all dependent data items prior to workflow execution. We also present experimental results showing that our approach can significantly outperform traditional data lineage query techniques.
AB - We present a location-based approach for executing provenance lineage queries that significantly reduces query execution cost without incurring additional storage costs. The key idea of our approach is to exploit the fact that provenance graphs resemble the workflow graphs that generated them and that many workflow computation models assume workflow steps have statically defined data consumption-production (i.e., data input-output) rates. We describe a new lineage computation technique that uses the structure of workflow specifications together with consumption-production rates to pre-compute (i.e., to forecast) the access paths of all dependent data items prior to workflow execution. We also present experimental results showing that our approach can significantly outperform traditional data lineage query techniques.
UR - http://www.scopus.com/inward/record.url?scp=84928798542&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84928798542&partnerID=8YFLogxK
U2 - 10.1007/978-3-319-16462-5_14
DO - 10.1007/978-3-319-16462-5_14
M3 - Conference contribution
AN - SCOPUS:84928798542
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 180
EP - 193
BT - Provenance and Annotation of Data and Processes - 5th International Provenance and Annotation Workshop, IPAW 2014, Revised Selected Papers
A2 - Plale, Beth
A2 - Ludäscher, Bertram
PB - Springer
T2 - 5th International Provenance and Annotation Workshop, IPAW 2014
Y2 - 10 June 2014 through 11 June 2014
ER -