Exploiting thread structures to improve smoothing of language models for forum post retrieval

Huizhong Duan, Chengxiang Zhai

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

Due to many unique characteristics of forum data, forum post retrieval is different from traditional document retrieval and web search, raising interesting research questions about how to optimize the accuracy of forum post retrieval. In this paper, we study how to exploit the naturally available raw thread structures of forums to improve retrieval accuracy in the language modeling framework. Specifically, we propose and study two different schemes for smoothing the language model of a forum post based on the thread containing the post. We explore several variants of the two schemes to exploit thread structures in different ways. We also create a human-annotated test data set for forum post retrieval and evaluate the proposed smoothing methods using this data set. The experimental results show that the proposed methods for leveraging forum threads to improve estimation of document language models are effective, and that they outperform the existing smoothing methods for the forum post retrieval task.

Original language: English (US)
Title of host publication: Advances in Information Retrieval - 33rd European Conference on IR Research, ECIR 2011, Proceedings
Editors: Paul Clough, Colum Foley, Cathal Gurrin, Hyowon Lee, Gareth J.F. Jones, Wessel Kraaij, Vanessa Murdock
Publisher: Springer
Pages: 350-361
Number of pages: 12
ISBN (Print): 9783642201608
State: Published - 2011
Event: 33rd European Conference on Information Retrieval, ECIR 2011 - Dublin, Ireland
Duration: Apr 18 2011 - Apr 21 2011

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 6611 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 33rd European Conference on Information Retrieval, ECIR 2011
Country/Territory: Ireland
City: Dublin
Period: 4/18/11 - 4/21/11

Keywords

  • Forum post retrieval
  • Language modeling
  • Smoothing

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Computer Science (all)
