Classifying search queries using the Web as a source of knowledge

Evgeniy Gabrilovich, Andrei Broder, Marcus Fontoura, Amruta Joshi, Vanja Josifovski, Lance Riedel, Tong Zhang

Research output: Contribution to journal (Article, peer-reviewed)


We propose a methodology for building a robust query classification system that can identify thousands of query classes, while dealing in real time with the query volume of a commercial Web search engine. We use a pseudo relevance feedback technique: given a query, we determine its topic by classifying the Web search results retrieved by the query. Motivated by the needs of search advertising, we primarily focus on rare queries, which are the hardest from the point of view of machine learning, yet in aggregate account for a considerable fraction of search engine traffic. Empirical evaluation confirms that our methodology yields a considerably higher classification accuracy than previously reported. We believe that the proposed methodology will lead to better matching of online ads to rare queries and overall to a better user experience.
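The core idea in the abstract, classifying the results a query retrieves rather than the query itself, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the tiny in-memory `TOY_INDEX` stands in for a Web search engine, the `CLASS_KEYWORDS` lookup stands in for a trained document classifier, and majority voting is just one simple way to aggregate per-result labels. It is not the system described in the article.

```python
from collections import Counter

# Hypothetical stand-in for a Web search index (assumption, not from the article).
TOY_INDEX = [
    "cheap flights and hotel deals for your vacation",
    "book airline tickets and compare flight prices",
    "laptop reviews and best notebook computers",
    "hotel booking and travel guides",
]

# Hypothetical keyword sets standing in for a trained document classifier.
CLASS_KEYWORDS = {
    "travel": {"flights", "flight", "hotel", "airline", "vacation", "travel"},
    "computers": {"laptop", "notebook", "computers"},
}


def search(query, k=3):
    """Return up to k documents sharing a term with the query, best overlap first."""
    terms = set(query.lower().split())
    scored = [(len(terms & set(doc.split())), doc) for doc in TOY_INDEX]
    ranked = sorted((s for s in scored if s[0] > 0), key=lambda s: -s[0])
    return [doc for _, doc in ranked[:k]]


def classify_document(doc):
    """Label a document by its best keyword overlap, or None if nothing matches."""
    words = set(doc.split())
    scores = {label: len(words & kws) for label, kws in CLASS_KEYWORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


def classify_query(query, k=3):
    """Pseudo relevance feedback: label the retrieved results, then take a majority vote."""
    votes = Counter()
    for doc in search(query, k):
        label = classify_document(doc)
        if label is not None:
            votes[label] += 1
    return votes.most_common(1)[0][0] if votes else None
```

In this sketch the query `"tickets"` contains no class keyword itself, yet `classify_query("tickets")` can still assign a class because the document it retrieves does contain such keywords. That is the reason for classifying retrieved results: a short or rare query carries little evidence on its own, while its results supply more text to classify.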

Original language: English (US)
Article number: 5
Journal: ACM Transactions on the Web
Issue number: 2
State: Published - Apr 1 2009
Externally published: Yes


Keywords

  • Pseudo relevance feedback
  • Query classification
  • Web search

ASJC Scopus subject areas

  • Computer Networks and Communications

