Exploiting correlations for expensive predicate evaluation

Manas Joglekar, Hector Garcia-Molina, Aditya Parameswaran, Christopher Re

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

User-defined functions (UDFs) are increasingly used to augment query languages with extra, application-dependent functionality. Selection queries involving UDF predicates tend to be expensive, in terms of either monetary cost or latency. In this paper, we study ways to efficiently evaluate selection queries with UDF predicates. We provide a family of techniques for processing queries at low cost while satisfying user-specified precision and recall constraints. Our techniques apply to a variety of scenarios: when selection probabilities of tuples are available beforehand, when this information is available but noisy, and when no such prior information is available. We also generalize our techniques to more complex queries. Finally, we test our techniques on real datasets and show that they save up to 80% of UDF evaluations while incurring only a small reduction in accuracy.

Original language: English (US)
Title of host publication: SIGMOD 2015 - Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data
Publisher: Association for Computing Machinery
Pages: 1183-1198
Number of pages: 16
ISBN (Electronic): 9781450327589
State: Published - May 27 2015
Externally published: Yes
Event: ACM SIGMOD International Conference on Management of Data, SIGMOD 2015 - Melbourne, Australia
Duration: May 31 2015 → Jun 4 2015

Publication series

Name: Proceedings of the ACM SIGMOD International Conference on Management of Data
Volume: 2015-May
ISSN (Print): 0730-8078

Other

Other: ACM SIGMOD International Conference on Management of Data, SIGMOD 2015
Country/Territory: Australia
City: Melbourne
Period: 5/31/15 → 6/4/15

Keywords

  • Approximate query processing
  • User defined functions

ASJC Scopus subject areas

  • Software
  • Information Systems
