Constructing topical hierarchies in heterogeneous information networks

Chi Wang, Marina Danilevsky, Jialu Liu, Nihit Desai, Heng Ji, Jiawei Han

Research output: Contribution to journal > Conference article

Abstract

A digital data collection (e.g., scientific publications, enterprise reports, news, and social media) can often be modeled as a heterogeneous information network, linking text with multiple types of entities. Constructing high-quality concept hierarchies that can represent topics at multiple granularities benefits tasks such as search, information browsing, and pattern mining. In this work we present an algorithm for recursively constructing multi-typed topical hierarchies. Contrary to traditional text-based topic modeling, our approach handles both textual phrases and multiple types of entities by a newly designed clustering and ranking algorithm for heterogeneous network data, as well as mining and ranking topical patterns of different types. Our experiments on datasets from two different domains demonstrate that our algorithm yields high quality, multi-typed topical hierarchies.

Original language: English (US)
Article number: 6729561
Pages (from-to): 767-776
Number of pages: 10
Journal: Proceedings - IEEE International Conference on Data Mining, ICDM
DOI: 10.1109/ICDM.2013.53
State: Published - Dec 1 2013
Event: 13th IEEE International Conference on Data Mining, ICDM 2013 - Dallas, TX, United States
Duration: Dec 7 2013 to Dec 10 2013

Keywords

  • heterogeneous network
  • information network
  • topic hierarchy

ASJC Scopus subject areas

  • Engineering(all)

Cite this

Constructing topical hierarchies in heterogeneous information networks. / Wang, Chi; Danilevsky, Marina; Liu, Jialu; Desai, Nihit; Ji, Heng; Han, Jiawei.

In: Proceedings - IEEE International Conference on Data Mining, ICDM, 01.12.2013, p. 767-776.

@article{54e85cea71f246d99a4154dc43418725,
title = "Constructing topical hierarchies in heterogeneous information networks",
abstract = "A digital data collection (e.g., scientific publications, enterprise reports, news, and social media) can often be modeled as a heterogeneous information network, linking text with multiple types of entities. Constructing high-quality concept hierarchies that can represent topics at multiple granularities benefits tasks such as search, information browsing, and pattern mining. In this work we present an algorithm for recursively constructing multi-typed topical hierarchies. Contrary to traditional text-based topic modeling, our approach handles both textual phrases and multiple types of entities by a newly designed clustering and ranking algorithm for heterogeneous network data, as well as mining and ranking topical patterns of different types. Our experiments on datasets from two different domains demonstrate that our algorithm yields high quality, multi-typed topical hierarchies.",
keywords = "heterogeneous network, information network, topic hierarchy",
author = "Chi Wang and Marina Danilevsky and Jialu Liu and Nihit Desai and Heng Ji and Jiawei Han",
year = "2013",
month = "12",
day = "1",
doi = "10.1109/ICDM.2013.53",
language = "English (US)",
pages = "767--776",
journal = "Proceedings - IEEE International Conference on Data Mining, ICDM",
issn = "1550-4786",
}

TY  - JOUR
T1  - Constructing topical hierarchies in heterogeneous information networks
AU  - Wang, Chi
AU  - Danilevsky, Marina
AU  - Liu, Jialu
AU  - Desai, Nihit
AU  - Ji, Heng
AU  - Han, Jiawei
PY  - 2013/12/1
Y1  - 2013/12/1
N2  - A digital data collection (e.g., scientific publications, enterprise reports, news, and social media) can often be modeled as a heterogeneous information network, linking text with multiple types of entities. Constructing high-quality concept hierarchies that can represent topics at multiple granularities benefits tasks such as search, information browsing, and pattern mining. In this work we present an algorithm for recursively constructing multi-typed topical hierarchies. Contrary to traditional text-based topic modeling, our approach handles both textual phrases and multiple types of entities by a newly designed clustering and ranking algorithm for heterogeneous network data, as well as mining and ranking topical patterns of different types. Our experiments on datasets from two different domains demonstrate that our algorithm yields high quality, multi-typed topical hierarchies.
AB  - A digital data collection (e.g., scientific publications, enterprise reports, news, and social media) can often be modeled as a heterogeneous information network, linking text with multiple types of entities. Constructing high-quality concept hierarchies that can represent topics at multiple granularities benefits tasks such as search, information browsing, and pattern mining. In this work we present an algorithm for recursively constructing multi-typed topical hierarchies. Contrary to traditional text-based topic modeling, our approach handles both textual phrases and multiple types of entities by a newly designed clustering and ranking algorithm for heterogeneous network data, as well as mining and ranking topical patterns of different types. Our experiments on datasets from two different domains demonstrate that our algorithm yields high quality, multi-typed topical hierarchies.
KW  - heterogeneous network
KW  - information network
KW  - topic hierarchy
UR  - http://www.scopus.com/inward/record.url?scp=84894658486&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=84894658486&partnerID=8YFLogxK
U2  - 10.1109/ICDM.2013.53
DO  - 10.1109/ICDM.2013.53
M3  - Conference article
AN  - SCOPUS:84894658486
SP  - 767
EP  - 776
JO  - Proceedings - IEEE International Conference on Data Mining, ICDM
JF  - Proceedings - IEEE International Conference on Data Mining, ICDM
SN  - 1550-4786
M1  - 6729561
ER  -