Constructing topical hierarchies in heterogeneous information networks

Chi Wang, Jialu Liu, Nihit Desai, Marina Danilevsky, Jiawei Han

Research output: Contribution to journal › Article › peer-review

Abstract

Many digital document collections (e.g., scientific publications, enterprise reports, news articles, and social media) can be modeled as a heterogeneous information network, linking text with multiple types of entities. Constructing high-quality hierarchies that can represent topics at multiple granularities benefits tasks such as search, information browsing, and pattern mining. In this work, we present an algorithm for recursively constructing multi-typed topical hierarchies. Unlike traditional text-based topic modeling, our approach handles both textual phrases and multiple types of entities, using a newly designed clustering and ranking algorithm for heterogeneous network data together with the mining and ranking of topical patterns of different types. Our experiments on datasets from two different domains demonstrate that our algorithm yields high-quality, multi-typed topical hierarchies.

Original language: English (US)
Pages (from-to): 529-558
Number of pages: 30
Journal: Knowledge and Information Systems
Volume: 44
Issue number: 3
State: Published - Sep 17 2015

Keywords

  • Information network
  • Link mining
  • Text mining
  • Topic hierarchy
  • Topic modeling

ASJC Scopus subject areas

  • Software
  • Information Systems
  • Human-Computer Interaction
  • Hardware and Architecture
  • Artificial Intelligence
