TY - GEN
T1 - A declarative approach to customize workflow provenance
AU - Dey, Saumen
AU - Ludascher, Bertram
PY - 2013/5/2
Y1 - 2013/5/2
N2 - Provenance describes the origin, context, derivation, and ownership of data products and is becoming increasingly important in scientific applications. This information can be used, e.g., to explain, debug, and reproduce the results of computational experiments, or to determine the validity and quality of data products. At the same time, it may be infeasible or undesirable to share the complete provenance of a scientific experiment. To balance these requirements, we develop a framework and a system that allow scientists to declaratively specify their provenance data publication and customization requirements. Using this system, scientists can specify which parts of the provenance data are to be included in the result and which parts should be hidden or anonymized. However, arbitrary application of these specifications may not maintain provenance data integrity. Thus, we allow scientists to specify provenance data integrity requirements, in the form of provenance policies, along with their provenance data publication and customization requirements. Our system then systematically applies all the publication and customization requirements to the provenance data and ensures that all provenance policies specified by the scientist are satisfied.
AB - Provenance describes the origin, context, derivation, and ownership of data products and is becoming increasingly important in scientific applications. This information can be used, e.g., to explain, debug, and reproduce the results of computational experiments, or to determine the validity and quality of data products. At the same time, it may be infeasible or undesirable to share the complete provenance of a scientific experiment. To balance these requirements, we develop a framework and a system that allow scientists to declaratively specify their provenance data publication and customization requirements. Using this system, scientists can specify which parts of the provenance data are to be included in the result and which parts should be hidden or anonymized. However, arbitrary application of these specifications may not maintain provenance data integrity. Thus, we allow scientists to specify provenance data integrity requirements, in the form of provenance policies, along with their provenance data publication and customization requirements. Our system then systematically applies all the publication and customization requirements to the provenance data and ensures that all provenance policies specified by the scientist are satisfied.
UR - http://www.scopus.com/inward/record.url?scp=84876793833&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84876793833&partnerID=8YFLogxK
U2 - 10.1145/2457317.2457320
DO - 10.1145/2457317.2457320
M3 - Conference contribution
AN - SCOPUS:84876793833
SN - 9781450315999
T3 - ACM International Conference Proceeding Series
SP - 9
EP - 16
BT - Joint EDBT/ICDT 2013 Workshops - Proceedings
T2 - Joint EDBT/ICDT 2013 Workshops
Y2 - 18 March 2013 through 22 March 2013
ER -