Axiomatic analysis of smoothing methods in language models for Pseudo-Relevance Feedback

Hussein Hazimeh, Cheng Xiang Zhai

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Pseudo-Relevance Feedback (PRF) is an important general technique for improving retrieval effectiveness without requiring any user effort. Several state-of-the-art PRF models are based on the language modeling approach where a query language model is learned based on feedback documents. In all these models, feedback documents are represented with unigram language models smoothed with a collection language model. While collection language model-based smoothing has proven both effective and necessary in using language models for retrieval, we use axiomatic analysis to show that this smoothing scheme inherently causes the feedback model to favor frequent terms and thus violates the IDF constraint needed to ensure selection of discriminative feedback terms. To address this problem, we propose replacing collection language model-based smoothing in the feedback stage with additive smoothing, which is analytically shown to select more discriminative terms. Empirical evaluation further confirms that additive smoothing indeed significantly outperforms collection-based smoothing methods in multiple language model-based PRF models.
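
As a rough illustration of the two smoothing schemes the abstract contrasts, the Python sketch below computes a smoothed unigram model for a single feedback document in two ways. It is not the paper's implementation and does not reproduce its axiomatic argument. Jelinek-Mercer interpolation is used here as one common form of collection-based smoothing, and the function names, the parameters `lam` and `gamma`, and the toy three-word collection are all assumptions made for the example.

```python
from collections import Counter


def jelinek_mercer(doc_tokens, collection_prob, lam=0.5):
    """Collection-based smoothing (assumed Jelinek-Mercer form):
    P(w|d) = (1 - lam) * c(w, d) / |d| + lam * P(w|C).
    """
    counts = Counter(doc_tokens)
    doc_len = len(doc_tokens)
    return {
        w: (1 - lam) * counts[w] / doc_len + lam * p_c
        for w, p_c in collection_prob.items()
    }


def additive(doc_tokens, vocab, gamma=0.1):
    """Additive smoothing:
    P(w|d) = (c(w, d) + gamma) / (|d| + gamma * |V|).
    """
    counts = Counter(doc_tokens)
    denom = len(doc_tokens) + gamma * len(vocab)
    return {w: (counts[w] + gamma) / denom for w in vocab}


# Hypothetical toy collection: "the" is frequent, "axiomatic" is rare.
collection_prob = {"the": 0.6, "retrieval": 0.3, "axiomatic": 0.1}
vocab = list(collection_prob)

# A feedback document that contains neither "the" nor "axiomatic".
doc = ["retrieval", "retrieval"]

jm_model = jelinek_mercer(doc, collection_prob)
add_model = additive(doc, vocab)

for w in ("the", "axiomatic"):
    print(w, "JM:", jm_model[w], "additive:", add_model[w])
```

In this toy setting, neither "the" nor "axiomatic" occurs in the document. Collection-based smoothing still assigns the frequent word a larger probability than the rare one, because the smoothed value inherits the collection frequency. Additive smoothing assigns both unseen words the same probability. This is only the mechanical difference between the two formulas; how it interacts with the IDF constraint in specific feedback models is the subject of the paper itself.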

Original language: English (US)
Title of host publication: ICTIR 2015 - Proceedings of the 2015 ACM SIGIR International Conference on the Theory of Information Retrieval
Publisher: Association for Computing Machinery
Pages: 141-150
Number of pages: 10
ISBN (Electronic): 9781450338332
State: Published - Sep 27 2015
Event: 5th ACM SIGIR International Conference on the Theory of Information Retrieval, ICTIR 2015 - Northampton, United States
Duration: Sep 27 2015 to Sep 30 2015

Publication series

Name: ICTIR 2015 - Proceedings of the 2015 ACM SIGIR International Conference on the Theory of Information Retrieval

Other

Other: 5th ACM SIGIR International Conference on the Theory of Information Retrieval, ICTIR 2015
Country/Territory: United States
City: Northampton
Period: 9/27/15 to 9/30/15

Keywords

  • Divergence minimization model
  • Geometric relevance model
  • Pseudo relevance feedback
  • Relevance model
  • Smoothing

ASJC Scopus subject areas

  • Computer Science (miscellaneous)
  • Information Systems
