A SQL-Middleware unifying why and why-not provenance for first-order queries

Seokki Lee, Sven Köhler, Bertram Ludäscher, Boris Glavic

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Explaining why an answer is in the result of a query or why it is missing from the result is important for many applications including auditing, debugging data and queries, and answering hypothetical questions about data. Both types of questions, i.e., why and why-not provenance, have been studied extensively. In this work, we present the first practical approach for answering such questions for queries with negation (first-order queries). Our approach is based on a rewriting of Datalog rules (called firing rules) that captures successful rule derivations within the context of a Datalog query. We extend this rewriting to support negation and to capture failed derivations that explain missing answers. Given a (why or why-not) provenance question, we compute an explanation, i.e., the part of the provenance that is relevant to answer the question. We introduce optimizations that prune parts of a provenance graph early on if we can determine that they will not be part of the explanation for a given question. We present an implementation that runs on top of a relational database using SQL to compute explanations. Our experiments demonstrate that our approach scales to large instances and significantly outperforms an earlier approach which instantiates the full provenance to compute explanations.

Original language: English (US)
Title of host publication: Proceedings - 2017 IEEE 33rd International Conference on Data Engineering, ICDE 2017
Publisher: IEEE Computer Society
Pages: 485-496
Number of pages: 12
ISBN (Electronic): 9781509065431
DOIs: https://doi.org/10.1109/ICDE.2017.105
State: Published - May 16 2017
Externally published: Yes
Event: 33rd IEEE International Conference on Data Engineering, ICDE 2017 - San Diego, United States
Duration: Apr 19 2017 – Apr 22 2017

Publication series

Name: Proceedings - International Conference on Data Engineering
ISSN (Print): 1084-4627

Other

Other: 33rd IEEE International Conference on Data Engineering, ICDE 2017
Country: United States
City: San Diego
Period: 4/19/17 – 4/22/17

ASJC Scopus subject areas

  • Software
  • Signal Processing
  • Information Systems

Cite this

Lee, S., Köhler, S., Ludäscher, B., & Glavic, B. (2017). A SQL-Middleware unifying why and why-not provenance for first-order queries. In Proceedings - 2017 IEEE 33rd International Conference on Data Engineering, ICDE 2017 (pp. 485-496). [7930001] (Proceedings - International Conference on Data Engineering). IEEE Computer Society. https://doi.org/10.1109/ICDE.2017.105