TY - GEN
T1 - Demonstration of VerdictDB, the platform-independent AQP system
AU - He, Wen
AU - Park, Yongjoo
AU - Hanafi, Idris
AU - Yatvitskiy, Jacob
AU - Mozafari, Barzan
N1 - Publisher Copyright: © 2018 Association for Computing Machinery. Copyright 2018 Elsevier B.V., All rights reserved.
PY - 2018/5/27
Y1 - 2018/5/27
AB - We demonstrate VerdictDB, the first platform-independent approximate query processing (AQP) system. Unlike existing AQP systems that are tightly integrated into a specific database, VerdictDB operates at the driver level, acting as middleware between users and off-the-shelf database systems. In other words, VerdictDB requires no modifications to the database internals; it simply relies on rewriting incoming queries such that the standard execution of the rewritten queries under relational semantics yields approximate answers to the original queries. VerdictDB exploits a novel technique for error estimation called variational subsampling, which is amenable to efficient computation via SQL. In this demonstration, we showcase VerdictDB's performance benefits (up to two orders of magnitude) compared to queries issued directly to existing query engines. We also illustrate that the approximate answers returned by VerdictDB are nearly identical to the exact answers. We use Apache Spark SQL and Amazon Redshift as two examples of modern distributed query platforms. We allow the audience to explore VerdictDB using a web-based interface (e.g., Hue or Apache Zeppelin) to issue queries and visualize their answers. VerdictDB is currently open source and available under the Apache License (V2).
KW - Approximate query processing
KW - Data analytics
UR - http://www.scopus.com/inward/record.url?scp=85048830008&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85048830008&partnerID=8YFLogxK
U2 - 10.1145/3183713.3193538
DO - 10.1145/3183713.3193538
M3 - Conference contribution
AN - SCOPUS:85048830008
T3 - Proceedings of the ACM SIGMOD International Conference on Management of Data
SP - 1665
EP - 1668
BT - SIGMOD 2018 - Proceedings of the 2018 International Conference on Management of Data
A2 - Das, Gautam
A2 - Jermaine, Christopher
A2 - Eldawy, Ahmed
A2 - Bernstein, Philip
PB - Association for Computing Machinery
T2 - 44th ACM SIGMOD International Conference on Management of Data, SIGMOD 2018
Y2 - 10 June 2018 through 15 June 2018
ER -