BlazeIt: Optimizing declarative aggregation and limit queries for neural network-based video analytics

Daniel Kang, Peter Bailis, Matei Zaharia

Research output: Contribution to journal › Conference article › peer-review

Abstract

Recent advances in neural networks (NNs) have enabled automatic querying of large volumes of video data with high accuracy. While these deep NNs can produce accurate annotations of an object's position and type in video, they are computationally expensive and require complex, imperative deployment code to answer queries. Prior work uses approximate filtering to reduce the cost of video analytics, but does not handle two important classes of queries, aggregation and limit queries; moreover, these approaches still require complex code to deploy. To address the computational and usability challenges of querying video at scale, we introduce BLAZEIT, a system that optimizes queries of spatiotemporal information of objects in video. BLAZEIT accepts queries via FRAMEQL, a declarative extension of SQL for video analytics that enables video-specific query optimization. We introduce two new query optimization techniques in BLAZEIT that are not supported by prior work. First, we develop methods of using NNs as control variates to quickly answer approximate aggregation queries with error bounds. Second, we present a novel search algorithm for cardinality-limited video queries. Through these optimizations, BLAZEIT can deliver up to 83x speedups over the recent literature on video processing.
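
To make the control-variate idea in the abstract concrete, the following is a minimal sketch of the textbook control-variate mean estimator, not BLAZEIT's implementation. It assumes a cheap proxy score is available for every frame (so its overall mean is known) while the expensive detector is run only on a sample. The function name, parameter names, and the toy numbers are hypothetical, and the sketch omits the error bounds that the abstract mentions.

```python
import statistics


def control_variate_mean(expensive, cheap, cheap_mean):
    """Estimate the mean of an expensive signal using a cheap proxy.

    expensive  -- values from the costly model on the sampled frames
                  (assumed here: per-frame object counts)
    cheap      -- proxy values on the same sampled frames
    cheap_mean -- mean of the proxy over ALL frames (assumed cheap to compute)
    """
    n = len(expensive)
    mean_y = statistics.fmean(expensive)
    mean_x = statistics.fmean(cheap)
    var_x = statistics.variance(cheap)
    # Sample covariance between the proxy and the expensive signal.
    cov_xy = sum(
        (x - mean_x) * (y - mean_y) for x, y in zip(cheap, expensive)
    ) / (n - 1)
    # Variance-minimizing coefficient; fall back to the plain sample mean
    # when the proxy is constant and carries no information.
    c = cov_xy / var_x if var_x > 0 else 0.0
    return mean_y - c * (mean_x - cheap_mean)


# Hypothetical toy data: proxy scores for all frames, detector run on three.
proxy_all_frames = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
sampled_proxy = [1, 4, 7]
sampled_counts = [1, 4, 7]  # a perfect proxy, purely for illustration
estimate = control_variate_mean(
    sampled_counts, sampled_proxy, statistics.fmean(proxy_all_frames)
)
print(estimate)
```

The better the proxy correlates with the expensive signal, the more the correction term reduces the variance of the estimate, which is why fewer expensive evaluations are needed for a given accuracy.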

Original language: English (US)
Pages (from-to): 533-546
Number of pages: 14
Journal: Proceedings of the VLDB Endowment
Volume: 13
Issue number: 4
DOIs
State: Published - Dec 9 2019
Externally published: Yes
Event: 46th International Conference on Very Large Data Bases, VLDB 2020 - Tokyo, Japan
Duration: Aug 31 2020 to Sep 4 2020

ASJC Scopus subject areas

  • Computer Science (miscellaneous)
  • General Computer Science
