Language model information retrieval with document expansion

Tao Tao, Xuanhui Wang, Qiaozhu Mei, Cheng Xiang Zhai

Research output: Contribution to conference (Paper)

Abstract

Language model information retrieval depends on accurate estimation of document models. In this paper, we propose a document expansion technique to deal with the problem of insufficient sampling of documents. We construct a probabilistic neighborhood for each document and expand the document with its neighborhood information. The expanded document provides a more accurate estimate of the document model and thus improves retrieval accuracy. Moreover, since document expansion and pseudo feedback exploit different corpus structures, they can be combined to further improve performance. Experimental results on several different data sets demonstrate the effectiveness of the proposed document expansion method.
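The abstract does not state the estimation formulas, so the following is only a minimal sketch of the general idea under stated assumptions. It assumes cosine similarity over raw term counts as the neighborhood weight, a hypothetical interpolation parameter `alpha` between a document and its neighbors, and Dirichlet-smoothed scoring with a hypothetical prior `mu`. None of these choices is confirmed by this record; the function names are illustrative.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def expand(doc_id, counts, alpha=0.5):
    """Blend a document's counts with similarity-weighted neighbor counts.

    Assumption: neighbor weights are cosine similarities normalized to
    sum to 1, and alpha controls how much the original document is trusted.
    """
    doc = counts[doc_id]
    sims = {b: cosine(doc, c) for b, c in counts.items() if b != doc_id}
    total = sum(sims.values())
    if total == 0:
        # No similar neighbors: fall back to the unexpanded document.
        return Counter(doc)
    expanded = Counter({w: alpha * c for w, c in doc.items()})
    for b, sim in sims.items():
        weight = (1 - alpha) * sim / total
        if weight == 0:
            continue
        for w, c in counts[b].items():
            expanded[w] += weight * c
    return expanded


def dirichlet_prob(word, doc_counts, collection_prob, mu=2000.0):
    """Dirichlet-smoothed p(word | document) from (possibly fractional) counts."""
    length = sum(doc_counts.values())
    return (doc_counts.get(word, 0.0) + mu * collection_prob) / (length + mu)
```

In this sketch a short document that never mentions a term can still receive a nonzero pseudo-count for it when similar documents use that term, which is the intuition behind addressing insufficient sampling.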

Original language: English (US)
Pages: 407-414
Number of pages: 8
State: Published - Dec 1 2006
Event: 2006 Human Language Technology Conference - North American Chapter of the Association for Computational Linguistics Annual Meeting, HLT-NAACL 2006 - New York, NY, United States
Duration: Jun 4 2006 to Jun 9 2006

ASJC Scopus subject areas

  • Language and Linguistics
  • Linguistics and Language

Cite this

Tao, T., Wang, X., Mei, Q., & Zhai, C. X. (2006). Language model information retrieval with document expansion. 407-414. Paper presented at 2006 Human Language Technology Conference - North American Chapter of the Association for Computational Linguistics Annual Meeting, HLT-NAACL 2006, New York, NY, United States.