TY - GEN
T1 - Revisiting the divergence minimization feedback model
AU - Lv, Yuanhua
AU - Zhai, ChengXiang
N1 - Publisher Copyright:
Copyright 2014 ACM.
PY - 2014/11/3
Y1 - 2014/11/3
AB - Pseudo-relevance feedback (PRF) has proven to be an effective strategy for improving retrieval accuracy. In this paper, we revisit a PRF method based on statistical language models, namely the divergence minimization model (DMM). DMM not only has an apparently sound theoretical foundation, but has also been shown to satisfy most of the retrieval constraints. However, it turns out to perform surprisingly poorly in many previous experiments. We investigate the cause and reveal that DMM inappropriately handles the entropy of the feedback model, which leads to a highly skewed feedback model. To address this problem, we propose a maximum-entropy divergence minimization model (MEDMM) by introducing an entropy term to regularize DMM. Our experiments on various TREC collections demonstrate that MEDMM not only works much better than DMM, but also outperforms several other state-of-the-art PRF methods, especially on web collections. Moreover, unlike existing PRF models, which have to be combined with the original query to perform well, MEDMM works effectively even without being combined with the original query.
KW - Additive smoothing
KW - Divergence minimization
KW - Maximum entropy
KW - Query language model
UR - http://www.scopus.com/inward/record.url?scp=84937606009&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84937606009&partnerID=8YFLogxK
DO - 10.1145/2661829.2661900
M3 - Conference contribution
AN - SCOPUS:84937606009
T3 - CIKM 2014 - Proceedings of the 2014 ACM International Conference on Information and Knowledge Management
SP - 1863
EP - 1866
BT - CIKM 2014 - Proceedings of the 2014 ACM International Conference on Information and Knowledge Management
PB - Association for Computing Machinery
T2 - 23rd ACM International Conference on Information and Knowledge Management, CIKM 2014
Y2 - 3 November 2014 through 7 November 2014
ER -