Faster Maximum Inner Product Search in High Dimensions

Mo Tiwari, Ryan Kang, Je Yong Lee, Donghyun Lee, Chris Piech, Sebastian Thrun, Ilan Shomorony, Martin Jinye Zhang

Research output: Contribution to journalConference articlepeer-review

Abstract

Maximum Inner Product Search (MIPS) is a ubiquitous task in machine learning applications. Given a query vector and n other vectors in d dimensions, the MIPS problem is to find the atom that has the highest inner product with the query vector. Existing MIPS algorithms scale at least as O(d) with respect to d, which becomes computationally prohibitive in high-dimensional settings. In this work, we present BanditMIPS, a novel randomized algorithm that provably improves the state-of-the-art complexity from O(d) to O(1) with respect to d. We validate the scaling of BanditMIPS and demonstrate that BanditMIPS outperforms prior state-of-the-art MIPS algorithms in sample complexity, wall-clock time, and precision/speedup tradeoff across a variety of experimental settings. Furthermore, we propose a variant of our algorithm, named BanditMIPSα, which improves upon BanditMIPS by employing non-uniform sampling across coordinates. We also demonstrate the usefulness of BanditMIPS in problems for which MIPS is a subroutine, including Matching Pursuit and Fourier analysis. Finally, we demonstrate that BanditMIPS can be used in conjunction with preprocessing techniques to improve its complexity with respect to n. All of our experimental results are reproducible via a 1-line script at github.com/ThrunGroup/BanditMIPS.

Original languageEnglish (US)
Pages (from-to)48344-48361
Number of pages18
JournalProceedings of Machine Learning Research
Volume235
StatePublished - 2024
Event41st International Conference on Machine Learning, ICML 2024 - Vienna, Austria
Duration: Jul 21 2024Jul 27 2024

ASJC Scopus subject areas

  • Artificial Intelligence
  • Software
  • Control and Systems Engineering
  • Statistics and Probability

Fingerprint

Dive into the research topics of 'Faster Maximum Inner Product Search in High Dimensions'. Together they form a unique fingerprint.

Cite this