Abstract
Language modelling approaches to information retrieval are attractive and promising because they connect the problem of retrieval with that of language model estimation, which has been studied extensively in other application areas such as speech recognition. The basic idea of these approaches is to estimate a language model for each document, and then rank documents by the likelihood of the query according to the estimated language model. A core problem in language model estimation is smoothing, which adjusts the maximum likelihood estimator so as to correct the inaccuracy due to data sparseness. In this paper, we study the problem of language model smoothing and its influence on retrieval performance. We examine the sensitivity of retrieval performance to the smoothing parameters and compare several popular smoothing methods on different test collections.
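As a minimal sketch of the ranking idea described above, the following Python scores each document by the log-likelihood of the query under a smoothed unigram model of that document. The abstract does not say which smoothing methods are compared, so this assumes linear interpolation with a collection-wide model (often called Jelinek-Mercer smoothing). The function name `score_query`, the parameter `lam`, and the toy documents are hypothetical and for illustration only.

```python
import math
from collections import Counter


def score_query(query, doc, collection_counts, collection_len, lam=0.5):
    """Log-likelihood of `query` under a smoothed unigram model of `doc`.

    Assumed smoothing: p(w|d) = (1 - lam) * p_ml(w|d) + lam * p(w|collection).
    """
    doc_counts = Counter(doc)
    doc_len = len(doc)
    score = 0.0
    for term in query:
        # Maximum likelihood estimate from the document alone.
        p_ml = doc_counts[term] / doc_len if doc_len else 0.0
        # Background estimate from the whole collection.
        p_coll = collection_counts[term] / collection_len
        p = (1 - lam) * p_ml + lam * p_coll
        if p == 0.0:
            # The term has zero probability under the model.
            return float("-inf")
        score += math.log(p)
    return score


# Toy collection (illustrative data, not from the paper).
docs = {
    "d1": "language model smoothing for retrieval".split(),
    "d2": "speech recognition with a language model".split(),
}
collection_counts = Counter(t for d in docs.values() for t in d)
collection_len = sum(collection_counts.values())

query = "smoothing retrieval".split()
ranking = sorted(
    docs,
    key=lambda d: score_query(query, docs[d], collection_counts, collection_len),
    reverse=True,
)
```

With `lam=0.0` the model reduces to the unsmoothed maximum likelihood estimator, and any document missing a query term gets a score of negative infinity. This is the data sparseness problem that smoothing is meant to correct.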
Original language | English (US) |
---|---|
Pages (from-to) | 334-342 |
Number of pages | 9 |
Journal | SIGIR Forum (ACM Special Interest Group on Information Retrieval) |
State | Published - 2001 |
Externally published | Yes |
Event | 24th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval - New Orleans, LA, United States; Duration: Sep 9 2001 → Sep 13 2001 |
ASJC Scopus subject areas
- Management Information Systems
- Hardware and Architecture