Statistical translation language model for twitter search

Maryam Karimzadehgan, Cheng Xiang Zhai, Miles Efron

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

With the prevalence of social media applications, an increasing number of internet users are actively publishing text information online. This influx provides a wealth of text information about those users. Ranking in social media poses different challenges than Web search ranking, one of which is that microblog messages are very short. As a result, the vocabulary mismatch problem is exacerbated in social media search. In this paper, we first study the standard translation model for this problem and reveal that the translation language model not only helps to bridge the vocabulary gap but also improves the estimate of term frequency. We further propose two ways to improve the translation language model: leveraging hashtag information and adaptively setting the self-translation parameter. Experimental results on a Twitter data set show that our proposed methods are effective.
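As a rough illustration of the idea in the abstract, the sketch below scores a query word against a short message by summing, over the message's terms, a translation probability times the term's relative frequency. It is a minimal sketch of a generic translation language model, not the authors' implementation: the function name, the `translations` table, the toy tweet, and the fixed `self_prob` value are all hypothetical, and the paper's hashtag-based and adaptive estimates are not reproduced here.

```python
from collections import Counter


def translated_term_prob(word, doc_tokens, translations, self_prob=0.6):
    """Estimate p(word | doc) = sum_u p_t(word | u) * p_ml(u | doc).

    translations[u][w] is an assumed probability of translating u into a
    different word w. self_prob is the assumed self-translation
    probability p_t(u | u); the remaining mass (1 - self_prob) is spread
    over the other translations of u.
    """
    counts = Counter(doc_tokens)
    total = sum(counts.values())
    prob = 0.0
    for u, count in counts.items():
        p_ml = count / total  # maximum-likelihood estimate of u in the doc
        if u == word:
            p_t = self_prob
        else:
            p_t = (1.0 - self_prob) * translations.get(u, {}).get(word, 0.0)
        prob += p_t * p_ml
    return prob


# Hypothetical toy data: "nyc" and "newyork" translate into each other.
translations = {"nyc": {"newyork": 1.0}, "newyork": {"nyc": 1.0}}
tweet = ["nyc", "marathon"]

p_exact = translated_term_prob("nyc", tweet, translations)
p_related = translated_term_prob("newyork", tweet, translations)
p_unrelated = translated_term_prob("pizza", tweet, translations)
```

In this toy setup, a query word that never occurs in the message ("newyork") still receives nonzero probability through its related term, which is how a translation model can bridge a vocabulary gap. The value of `self_prob` controls how much weight exact matches keep relative to translated ones.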

Original language: English (US)
Title of host publication: International Conference on the Theory of Information Retrieval, ICTIR 2013 Proceedings
Pages: 121-124
Number of pages: 4
State: Published - 2013
Event: 4th International Conference on the Theory of Information Retrieval, ICTIR 2013 - Copenhagen, Denmark
Duration: Sep 29 2013 - Oct 2 2013

Publication series

Name: ACM International Conference Proceeding Series

Other

Other: 4th International Conference on the Theory of Information Retrieval, ICTIR 2013
Country/Territory: Denmark
City: Copenhagen
Period: 9/29/13 - 10/2/13

Keywords

  • Hashtag
  • Statistical machine translation
  • Twitter

ASJC Scopus subject areas

  • Software
  • Human-Computer Interaction
  • Computer Vision and Pattern Recognition
  • Computer Networks and Communications
