PUG: a framework and practical implementation for why and why-not provenance

Seokki Lee, Bertram Ludäscher, Boris Glavic

Research output: Contribution to journal › Article › peer-review

Abstract

Explaining why an answer is (or is not) returned by a query is important for many applications including auditing, debugging data and queries, and answering hypothetical questions about data. In this work, we present the first practical approach for answering such questions for queries with negation (first-order queries). Specifically, we introduce a graph-based provenance model that, while syntactic in nature, supports reverse reasoning and is proven to encode a wide range of provenance models from the literature. The implementation of this model in our PUG (Provenance Unification through Graphs) system takes a provenance question and Datalog query as an input and generates a Datalog program that computes an explanation, i.e., the part of the provenance that is relevant to answer the question. Furthermore, we demonstrate how a desirable factorization of provenance can be achieved by rewriting an input query. We experimentally evaluate our approach demonstrating its efficiency.
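
As a rough illustration of what a why or why-not question over a query with negation asks for, the following minimal sketch evaluates a single Datalog-style rule by brute force and reports which rule bindings succeed or fail. It is not PUG's algorithm or provenance graph model: the relation `Train`, the rule `only2hop`, the sample data, and the helper names `rule_bindings`, `why`, and `why_not` are all assumptions made up for this example.

```python
# Hypothetical toy database of direct train connections (illustrative only).
TRAIN = {("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")}
DOMAIN = sorted({city for edge in TRAIN for city in edge})


def rule_bindings(x, y):
    """Enumerate every binding of Z for the assumed rule
        only2hop(X, Y) :- Train(X, Z), Train(Z, Y), not Train(X, Y)
    and record, per goal, whether it holds in TRAIN."""
    for z in DOMAIN:
        goals = [
            (f"Train({x},{z})", (x, z) in TRAIN),
            (f"Train({z},{y})", (z, y) in TRAIN),
            (f"not Train({x},{y})", (x, y) not in TRAIN),
        ]
        yield z, goals


def why(x, y):
    """Bindings of Z under which all goals hold; each justifies the answer."""
    return [z for z, goals in rule_bindings(x, y)
            if all(ok for _, ok in goals)]


def why_not(x, y):
    """For each failed binding of Z, the goals that caused the failure."""
    return {
        z: [label for label, ok in goals if not ok]
        for z, goals in rule_bindings(x, y)
        if not all(ok for _, ok in goals)
    }


if __name__ == "__main__":
    print(why("b", "d"))       # successful derivations of only2hop(b, d)
    print(why_not("a", "c"))   # failed derivations of only2hop(a, c)
```

In this sketch, `why("b", "d")` lists the intermediate stops that derive the answer, while `why_not("a", "c")` shows that every candidate derivation fails at least on the negated goal, because a direct connection exists. The approach described in the abstract instead generates a Datalog program that computes only the part of the provenance relevant to the question, rather than enumerating all bindings as done here.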

Original language: English (US)
Pages (from-to): 47-71
Number of pages: 25
Journal: VLDB Journal
Volume: 28
Issue number: 1
State: Published - Feb 1 2019

Keywords

  • Datalog
  • Missing answers
  • Provenance
  • Semirings

ASJC Scopus subject areas

  • Information Systems
  • Hardware and Architecture

