LPTA: A probabilistic model for latent periodic topic analysis

Zhijun Yin, Liangliang Cao, Jiawei Han, Chengxiang Zhai, Thomas Huang

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

This paper studies the problem of latent periodic topic analysis from timestamped documents. Examples of timestamped documents include news articles, sales records, financial reports, TV programs, and, more recently, posts from social media websites such as Flickr, Twitter, and Facebook. Rather than detecting periodic patterns in a traditional time series database, we discover topics with coherent semantics and periodic characteristics, where a topic is represented by a distribution of words. We propose a model called LPTA (Latent Periodic Topic Analysis) that exploits the periodicity of terms as well as term co-occurrences. To show the effectiveness of our model, we collect several representative datasets, including Seminar, DBLP, and Flickr. The results show that our model discovers latent periodic topics effectively and makes good use of information from both text and time.

Original language: English (US)
Title of host publication: Proceedings - 11th IEEE International Conference on Data Mining, ICDM 2011
Pages: 904-913
Number of pages: 10
State: Published - 2011
Event: 11th IEEE International Conference on Data Mining, ICDM 2011 - Vancouver, BC, Canada
Duration: Dec 11 2011 to Dec 14 2011

Publication series

Name: Proceedings - IEEE International Conference on Data Mining, ICDM
ISSN (Print): 1550-4786

Other

Other: 11th IEEE International Conference on Data Mining, ICDM 2011
Country/Territory: Canada
City: Vancouver, BC
Period: 12/11/11 to 12/14/11

Keywords

  • Periodic topics
  • Topic modeling

ASJC Scopus subject areas

  • Engineering(all)
