TY - JOUR
T1 - I've seen "enough": Incrementally improving visualizations to support rapid decision making
T2 - 43rd International Conference on Very Large Data Bases, VLDB 2017
AU - Rahman, Sajjadur
AU - Aliakbarpour, Maryam
AU - Kong, Ha Kyung
AU - Blais, Eric
AU - Karahalios, Karrie
AU - Parameswaran, Aditya
AU - Rubinfeld, Ronitt
N1 - We acknowledge support from grants IIS-1513407, IIS-1633755, IIS-1652750, CCF-1029679, CCF-1420692, and CCF-1650733 awarded by the NSF, grants 1U54GM114838 and 3U54EB020406-02S1 awarded by the NIH BD2K Initiative, a Discovery Grant awarded by NSERC, grant 1536/14 awarded by ISF, and funds from Adobe, Google, and the Siebel Energy Institute.
PY - 2017/8/1
Y1 - 2017/8/1
N2 - Data visualization is an effective mechanism for identifying trends, insights, and anomalies in data. On large datasets, however, generating visualizations can take a long time, delaying the extraction of insights, hampering decision making, and reducing exploration time. One solution is to use online sampling-based schemes to generate visualizations faster while improving the displayed estimates incrementally, eventually converging to the exact visualization computed on the entire data. However, the intermediate visualizations are approximate and often fluctuate drastically, leading to potentially incorrect decisions. We propose sampling-based incremental visualization algorithms that reveal the "salient" features of the visualization quickly, with a 46× speedup relative to baselines, while minimizing error, thus enabling rapid and error-free decision making. We demonstrate that these algorithms are optimal in terms of sample complexity, in that, given the level of interactivity, they generate approximations that take as few samples as possible. We have developed the algorithms in the context of an incremental visualization tool, titled INCVISAGE, for trendline and heatmap visualizations. We evaluate the usability of INCVISAGE via user studies and demonstrate that users are able to make effective decisions with incrementally improving visualizations, especially compared to vanilla online-sampling-based schemes.
AB - Data visualization is an effective mechanism for identifying trends, insights, and anomalies in data. On large datasets, however, generating visualizations can take a long time, delaying the extraction of insights, hampering decision making, and reducing exploration time. One solution is to use online sampling-based schemes to generate visualizations faster while improving the displayed estimates incrementally, eventually converging to the exact visualization computed on the entire data. However, the intermediate visualizations are approximate and often fluctuate drastically, leading to potentially incorrect decisions. We propose sampling-based incremental visualization algorithms that reveal the "salient" features of the visualization quickly, with a 46× speedup relative to baselines, while minimizing error, thus enabling rapid and error-free decision making. We demonstrate that these algorithms are optimal in terms of sample complexity, in that, given the level of interactivity, they generate approximations that take as few samples as possible. We have developed the algorithms in the context of an incremental visualization tool, titled INCVISAGE, for trendline and heatmap visualizations. We evaluate the usability of INCVISAGE via user studies and demonstrate that users are able to make effective decisions with incrementally improving visualizations, especially compared to vanilla online-sampling-based schemes.
UR - http://www.scopus.com/inward/record.url?scp=85037042988&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85037042988&partnerID=8YFLogxK
U2 - 10.14778/3137628.3137637
DO - 10.14778/3137628.3137637
M3 - Conference article
AN - SCOPUS:85037042988
SN - 2150-8097
VL - 10
SP - 1262
EP - 1273
JO - Proceedings of the VLDB Endowment
JF - Proceedings of the VLDB Endowment
IS - 11
Y2 - 28 August 2017 through 1 September 2017
ER -