Abstract
Explaining why an answer is (not) in the result of a query has proven to be of immense importance for many applications. However, why-not provenance, and to a lesser degree also why-provenance, can be very large, even for small input datasets. The resulting scalability and usability issues have limited the applicability of provenance. We present PUG, a system for why and why-not provenance that applies a range of novel techniques to overcome these challenges. Specifically, PUG limits provenance capture to what is relevant to explain a (missing) result of interest and uses an efficient sampling-based summarization method to produce compact explanations for (missing) answers. Using two real-world datasets, we demonstrate how a user can draw meaningful insights from explanations produced by PUG.
Original language | English (US) |
---|---|
Pages (from-to) | 1954-1957 |
Number of pages | 4 |
Journal | Proceedings of the VLDB Endowment |
Volume | 11 |
Issue number | 12 |
DOIs | 10.14778/3229863.3236233 |
State | Published - Jan 1 2018 |
Event | 44th International Conference on Very Large Data Bases, VLDB 2018 - Rio de Janeiro, Brazil. Duration: Aug 27 2018 → Aug 31 2018 |
ASJC Scopus subject areas
- Computer Science (miscellaneous)
- Computer Science (all)
Cite this
Provenance summaries for answers and nonanswers. / Lee, Seokki; Ludaescher, Bertram; Glavic, Boris.
In: Proceedings of the VLDB Endowment, Vol. 11, No. 12, 01.01.2018, p. 1954-1957.
Research output: Contribution to journal › Conference article
TY - JOUR
T1 - Provenance summaries for answers and nonanswers
AU - Lee, Seokki
AU - Ludaescher, Bertram
AU - Glavic, Boris
PY - 2018/1/1
Y1 - 2018/1/1
AB - Explaining why an answer is (not) in the result of a query has proven to be of immense importance for many applications. However, why-not provenance, and to a lesser degree also why-provenance, can be very large, even for small input datasets. The resulting scalability and usability issues have limited the applicability of provenance. We present PUG, a system for why and why-not provenance that applies a range of novel techniques to overcome these challenges. Specifically, PUG limits provenance capture to what is relevant to explain a (missing) result of interest and uses an efficient sampling-based summarization method to produce compact explanations for (missing) answers. Using two real-world datasets, we demonstrate how a user can draw meaningful insights from explanations produced by PUG.
UR - http://www.scopus.com/inward/record.url?scp=85058882527&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85058882527&partnerID=8YFLogxK
U2 - 10.14778/3229863.3236233
DO - 10.14778/3229863.3236233
M3 - Conference article
AN - SCOPUS:85058882527
VL - 11
SP - 1954
EP - 1957
JO - Proceedings of the VLDB Endowment
JF - Proceedings of the VLDB Endowment
SN - 2150-8097
IS - 12
ER -