Provenance summaries for answers and nonanswers

Seokki Lee, Bertram Ludaescher, Boris Glavic

Research output: Contribution to journal › Conference article

Abstract

Explaining why an answer is (not) in the result of a query has proven to be of immense importance for many applications. However, why-not provenance, and to a lesser degree also why-provenance, can be very large, even for small input datasets. The resulting scalability and usability issues have limited the applicability of provenance. We present PUG, a system for why and why-not provenance that applies a range of novel techniques to overcome these challenges. Specifically, PUG limits provenance capture to what is relevant to explain a (missing) result of interest and uses an efficient sampling-based summarization method to produce compact explanations for (missing) answers. Using two real-world datasets, we demonstrate how a user can draw meaningful insights from explanations produced by PUG.

Original language: English (US)
Pages (from-to): 1954-1957
Number of pages: 4
Journal: Proceedings of the VLDB Endowment
Volume: 11
Issue number: 12
DOIs: 10.14778/3229863.3236233
State: Published - Jan 1 2018
Event: 44th International Conference on Very Large Data Bases, VLDB 2018 - Rio de Janeiro, Brazil
Duration: Aug 27 2018 to Aug 31 2018

Fingerprint

Scalability
Sampling

ASJC Scopus subject areas

  • Computer Science (miscellaneous)
  • Computer Science (all)

Cite this

Provenance summaries for answers and nonanswers. / Lee, Seokki; Ludaescher, Bertram; Glavic, Boris.

In: Proceedings of the VLDB Endowment, Vol. 11, No. 12, 01.01.2018, p. 1954-1957.

Lee, Seokki ; Ludaescher, Bertram ; Glavic, Boris. / Provenance summaries for answers and nonanswers. In: Proceedings of the VLDB Endowment. 2018 ; Vol. 11, No. 12. pp. 1954-1957.
@article{9b9be78a104d4042874fa666213890cd,
title = "Provenance summaries for answers and nonanswers",
abstract = "Explaining why an answer is (not) in the result of a query has proven to be of immense importance for many applications. However, why-not provenance, and to a lesser degree also why-provenance, can be very large, even for small input datasets. The resulting scalability and usability issues have limited the applicability of provenance. We present PUG, a system for why and why-not provenance that applies a range of novel techniques to overcome these challenges. Specifically, PUG limits provenance capture to what is relevant to explain a (missing) result of interest and uses an efficient sampling-based summarization method to produce compact explanations for (missing) answers. Using two real-world datasets, we demonstrate how a user can draw meaningful insights from explanations produced by PUG.",
author = "Seokki Lee and Bertram Ludaescher and Boris Glavic",
year = "2018",
month = "1",
day = "1",
doi = "10.14778/3229863.3236233",
language = "English (US)",
volume = "11",
pages = "1954--1957",
journal = "Proceedings of the VLDB Endowment",
issn = "2150-8097",
publisher = "Very Large Data Base Endowment Inc.",
number = "12",

}

TY - JOUR

T1 - Provenance summaries for answers and nonanswers

AU - Lee, Seokki

AU - Ludaescher, Bertram

AU - Glavic, Boris

PY - 2018/1/1

Y1 - 2018/1/1

N2 - Explaining why an answer is (not) in the result of a query has proven to be of immense importance for many applications. However, why-not provenance, and to a lesser degree also why-provenance, can be very large, even for small input datasets. The resulting scalability and usability issues have limited the applicability of provenance. We present PUG, a system for why and why-not provenance that applies a range of novel techniques to overcome these challenges. Specifically, PUG limits provenance capture to what is relevant to explain a (missing) result of interest and uses an efficient sampling-based summarization method to produce compact explanations for (missing) answers. Using two real-world datasets, we demonstrate how a user can draw meaningful insights from explanations produced by PUG.

UR - http://www.scopus.com/inward/record.url?scp=85058882527&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85058882527&partnerID=8YFLogxK

U2 - 10.14778/3229863.3236233

DO - 10.14778/3229863.3236233

M3 - Conference article

AN - SCOPUS:85058882527

VL - 11

SP - 1954

EP - 1957

JO - Proceedings of the VLDB Endowment

JF - Proceedings of the VLDB Endowment

SN - 2150-8097

IS - 12

ER -