A study of smoothing methods for language models applied to ad hoc information retrieval

C. Zhai, J. Lafferty

Research output: Contribution to journal › Conference article › peer-review

Abstract

Language modelling approaches to information retrieval are attractive and promising because they connect the problem of retrieval with that of language model estimation, which has been studied extensively in other application areas such as speech recognition. The basic idea of these approaches is to estimate a language model for each document, and then rank documents by the likelihood of the query according to the estimated language model. A core problem in language model estimation is smoothing, which adjusts the maximum likelihood estimator so as to correct the inaccuracy due to data sparseness. In this paper, we study the problem of language model smoothing and its influence on retrieval performance. We examine the sensitivity of retrieval performance to the smoothing parameters and compare several popular smoothing methods on different test collections.
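
The ranking idea described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes a unigram document model smoothed by linear interpolation with a collection-wide model (often called Jelinek-Mercer smoothing), which is only one of several possible smoothing choices. The function names `score_query` and `rank`, the toy documents, and the interpolation weight `lam=0.5` are all hypothetical and chosen for illustration.

```python
import math
from collections import Counter


def score_query(query, doc, collection_counts, collection_len, lam=0.5):
    """Log-likelihood of the query under a smoothed unigram model of doc.

    Assumed smoothing: p(w|d) = (1 - lam) * p_ml(w|d) + lam * p(w|collection).
    """
    doc_counts = Counter(doc)
    doc_len = len(doc)
    log_p = 0.0
    for term in query:
        # Maximum likelihood estimate from the document alone.
        p_ml = doc_counts[term] / doc_len if doc_len else 0.0
        # Background estimate from the whole collection.
        p_coll = collection_counts[term] / collection_len
        p = (1 - lam) * p_ml + lam * p_coll
        if p == 0.0:
            # Without smoothing, one unseen query term zeroes the likelihood.
            return float("-inf")
        log_p += math.log(p)
    return log_p


def rank(query, docs, lam=0.5):
    """Return document names ordered by descending query log-likelihood."""
    collection_counts = Counter(t for d in docs.values() for t in d)
    collection_len = sum(collection_counts.values())
    scores = {
        name: score_query(query, d, collection_counts, collection_len, lam)
        for name, d in docs.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


# Toy collection (hypothetical data, for illustration only).
docs = {
    "d1": "smoothing adjusts the maximum likelihood estimator".split(),
    "d2": "speech recognition uses language model estimation".split(),
    "d3": "retrieval ranks documents by query likelihood".split(),
}
ranking = rank("language model smoothing".split(), docs)
print(ranking)  # ['d2', 'd1', 'd3']
```

With `lam=0.0` the model reduces to the unsmoothed maximum likelihood estimator, and any document missing a single query term receives a score of negative infinity. That is the data sparseness problem smoothing is meant to correct, and varying `lam` is a simple way to see how sensitive the ranking is to the smoothing parameter.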

Original language: English (US)
Pages (from-to): 334-342
Number of pages: 9
Journal: SIGIR Forum (ACM Special Interest Group on Information Retrieval)
State: Published - 2001
Externally published: Yes
Event: 24th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval - New Orleans, LA, United States
Duration: Sep 9 2001 → Sep 13 2001

ASJC Scopus subject areas

  • Management Information Systems
  • Hardware and Architecture
