TY - GEN
T1 - A study of Poisson query generation model for information retrieval
AU - Mei, Qiaozhu
AU - Fang, Hui
AU - Zhai, Chengxiang
PY - 2007
Y1 - 2007
AB - Many variants of language models have been proposed for information retrieval. Most existing models are based on the multinomial distribution and score documents by the query likelihood computed from a probabilistic query generation model. In this paper, we propose and study a new family of query generation models based on the Poisson distribution. We show that in their simplest forms, the new family of models and the existing multinomial models are equivalent; with different smoothing methods, however, the two families behave differently. We show that the Poisson model has several advantages, including naturally accommodating per-term smoothing and modeling an accurate background more efficiently. We present several variants of the new model corresponding to different smoothing methods, and evaluate them on four representative TREC test collections. The results show that while the basic models perform comparably, with per-term smoothing the Poisson model can outperform the multinomial model. The performance can be further improved with two-stage smoothing.
KW - Formal models
KW - Poisson process
KW - Query generation
KW - Term dependent smoothing
UR - http://www.scopus.com/inward/record.url?scp=36448955332&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=36448955332&partnerID=8YFLogxK
U2 - 10.1145/1277741.1277797
DO - 10.1145/1277741.1277797
M3 - Conference contribution
AN - SCOPUS:36448955332
SN - 1595935975
SN - 9781595935977
T3 - Proceedings of the 30th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR'07
SP - 319
EP - 326
BT - Proceedings of the 30th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR'07
T2 - 30th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR'07
Y2 - 23 July 2007 through 27 July 2007
ER -