Probabilistic topic models with biased propagation on heterogeneous information networks

Hongbo Deng, Jiawei Han, Bo Zhao, Yintao Yu, Cindy Xide Lin

Research output: Chapter in Book/Report/Conference proceedingConference contribution

Abstract

With the development of Web applications, textual documents are not only getting richer, but also ubiquitously interconnected with users and other objects in various ways, which brings about text-rich heterogeneous information networks. Topic models have been proposed and shown to be useful for document analysis, and the interactions among multi-typed objects play a key role at disclosing the rich semantics of the network. However, most of topic models only consider the textual information while ignore the network structures or can merely integrate with homogeneous networks. None of them can handle heterogeneous information network well. In this paper, we propose a novel topic model with biased propagation (TMBP) algorithm to directly incorporate heterogeneous information network with topic modeling in a unified way. The underlying intuition is that multi-typed objects should be treated differently along with their inherent textual information and the rich semantics of the heterogeneous information network. A simple and unbiased topic propagation across such a heterogeneous network does not make much sense. Consequently, we investigate and develop two biased propagation frameworks, the biased random walk framework and the biased regularization framework, for the TMBP algorithm from different perspectives, which can discover latent topics and identify clusters of multi-typed objects simultaneously. We extensively evaluate the proposed approach and compare to the state-of-the-art techniques on several datasets. Experimental results demonstrate that the improvement in our proposed approach is consistent and promising.

Original languageEnglish (US)
Title of host publicationProceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD'11
PublisherAssociation for Computing Machinery
Pages1271-1279
Number of pages9
ISBN (Print)9781450308137
DOIs
StatePublished - 2011
Event17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2011 - San Diego, United States
Duration: Aug 21 2011Aug 24 2011

Publication series

NameProceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining

Conference

Conference17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2011
Country/TerritoryUnited States
CitySan Diego
Period8/21/118/24/11

Keywords

  • Biased propagation
  • Clustering
  • Heterogeneous information network
  • Topic modeling

ASJC Scopus subject areas

  • Software
  • Information Systems

Fingerprint

Dive into the research topics of 'Probabilistic topic models with biased propagation on heterogeneous information networks'. Together they form a unique fingerprint.

Cite this