Realtime query completion via deep language models

Po Wei Wang, Huan Zhang, Vijai Mohan, Inderjit S. Dhillon, J. Zico Kolter

Research output: Contribution to journal › Conference article › peer-review


Search engine users now depend heavily on query completion and correction to shape their queries. Typically, completion is done by database lookup, which does not understand the context and cannot generalize to prefixes not in the database. In this paper, we propose to use unsupervised deep language models to complete and correct queries given an arbitrary prefix. We address two main challenges to make this method practical for large-scale deployment: 1) we propose a modified beam search process that integrates with a completion-distance-based error correction model, combining the error correction process (as a potential function) with the language model; and 2) we show how to efficiently perform our modified beam search on CPU to complete queries with error correction in real time, by exploiting the heavily overlapping forward propagation process and conducting amortized dynamic programming on the search tree, along with both SIMD-level and thread-level parallelism. We outperform the off-the-shelf Keras implementation by a factor of 50, allowing us to generate query suggestions in real time (the top 16 completions within 16 ms). Experiments on two large-scale datasets, one of them from AOL, show that the method substantially increases hit rate over standard approaches, reduces the memory footprint of the database-lookup-based approach by over two orders of magnitude, and is capable of handling tail queries.
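The idea of combining a language model with an error-correction potential inside beam search can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions: a smoothed character n-gram model stands in for the deep language model; "completion distance" is taken here to mean the smallest edit distance between the typed prefix and any prefix of a candidate; and the names `train_char_lm`, `extend_row`, `complete`, and the `penalty` weight are hypothetical. Each hypothesis carries one row of the edit-distance table, so extending it by a character costs a single dynamic-programming step, loosely mirroring the amortized dynamic programming on the search tree mentioned in the abstract.

```python
import math
from collections import defaultdict

START, END = "^", "$"  # hypothetical boundary markers, assumed absent from queries


def train_char_lm(queries, order=3, alpha=0.01):
    """Toy character n-gram model with add-alpha smoothing (stand-in for a deep LM)."""
    counts = defaultdict(lambda: defaultdict(int))
    vocab = {END}
    for q in queries:
        padded = START * order + q
        vocab.update(q)
        for i, ch in enumerate(q + END):
            counts[padded[i:i + order]][ch] += 1  # context = the `order` chars before ch
    vocab = sorted(vocab)

    def log_prob(text, ch):
        ctx = counts.get((START * order + text)[-order:], {})
        total = sum(ctx.values()) + alpha * len(vocab)
        return math.log((ctx.get(ch, 0) + alpha) / total)

    return log_prob, vocab


def extend_row(row, ch, typed):
    """One Levenshtein step: row[j] = distance(hypothesis, typed[:j]); append ch."""
    new = [row[0] + 1]
    for j, t in enumerate(typed, start=1):
        new.append(min(new[j - 1] + 1, row[j] + 1, row[j - 1] + (ch != t)))
    return new


def complete(typed, log_prob, vocab, k=3, beam=8, max_len=40, penalty=2.0):
    """Beam search scoring each hypothesis by LM log-probability minus an edit penalty."""
    # Hypothesis: (text, LM log-prob, edit-distance row, best distance to typed so far).
    beams = [("", 0.0, list(range(len(typed) + 1)), len(typed))]
    finished = []
    for _ in range(max_len):
        candidates = []
        for text, lp, row, dist in beams:
            for ch in vocab:
                new_lp = lp + log_prob(text, ch)
                if ch == END:
                    finished.append((new_lp - penalty * dist, text))
                else:
                    new_row = extend_row(row, ch, typed)
                    candidates.append(
                        (text + ch, new_lp, new_row, min(dist, new_row[-1]))
                    )
        # min(row) never decreases as a hypothesis grows, so min(dist, min(row))
        # is an optimistic bound on the final distance; use it to rank partial paths.
        candidates.sort(
            key=lambda h: h[1] - penalty * min(h[3], min(h[2])), reverse=True
        )
        beams = candidates[:beam]
    finished.sort(reverse=True)
    return [text for _, text in finished[:k]]


if __name__ == "__main__":
    lm, vocab = train_char_lm(["amazon prime", "amazon music", "apple watch"])
    print(complete("amzon pr", lm, vocab))  # misspelled prefix
```

The `penalty` weight trades off trust in the typed characters against the language model: a larger value keeps suggestions closer to what was typed, a smaller one lets the model override more keystrokes.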

Original language: English (US)
Journal: CEUR Workshop Proceedings
State: Published - 2018
Externally published: Yes
Event: 2018 SIGIR Workshop On eCommerce, eCom 2018 - Ann Arbor, United States
Duration: Jul 12 2018 → …


  • Deep learning
  • Query completion
  • Query correction
  • Realtime

ASJC Scopus subject areas

  • General Computer Science

