TY - GEN
T1 - Model-averaged latent semantic indexing
AU - Efron, Miles
PY - 2007
Y1 - 2007
N2 - This poster introduces a novel approach to information retrieval that uses statistical model averaging to improve latent semantic indexing (LSI). Instead of choosing a single dimensionality $k$ for LSI, we propose using several models of differing dimensionality to inform retrieval. To manage this ensemble we weight each model's contribution to an extent inversely proportional to its AIC (Akaike information criterion). Thus each model contributes proportionally to its expected Kullback-Leibler divergence from the distribution that generated the data. We present results on three standard IR test collections, demonstrating significant improvement over both the traditional vector space model and single-model LSI.
AB - This poster introduces a novel approach to information retrieval that uses statistical model averaging to improve latent semantic indexing (LSI). Instead of choosing a single dimensionality $k$ for LSI, we propose using several models of differing dimensionality to inform retrieval. To manage this ensemble we weight each model's contribution to an extent inversely proportional to its AIC (Akaike information criterion). Thus each model contributes proportionally to its expected Kullback-Leibler divergence from the distribution that generated the data. We present results on three standard IR test collections, demonstrating significant improvement over both the traditional vector space model and single-model LSI.
KW - Latent semantic indexing
KW - Model averaging
KW - Model selection
UR - http://www.scopus.com/inward/record.url?scp=36448930842&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=36448930842&partnerID=8YFLogxK
U2 - 10.1145/1277741.1277893
DO - 10.1145/1277741.1277893
M3 - Conference contribution
AN - SCOPUS:36448930842
SN - 1595935975
SN - 9781595935977
T3 - Proceedings of the 30th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR'07
SP - 755
EP - 756
BT - Proceedings of the 30th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR'07
T2 - 30th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR'07
Y2 - 23 July 2007 through 27 July 2007
ER -