Integrating meta-path selection with user-guided object clustering in heterogeneous information networks

Yizhou Sun, Brandon Norick, Jiawei Han, Xifeng Yan, Philip S. Yu, Xiao Yu

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

Real-world, multiple-typed objects are often interconnected, forming heterogeneous information networks. A major challenge for link-based clustering in such networks is its potential to generate many different results, carrying rather diverse semantic meanings. In order to generate desired clustering, we propose to use meta-path, a path that connects object types via a sequence of relations, to control clustering with distinct semantics. Nevertheless, it is easier for a user to provide a few examples ("seeds") than a weighted combination of sophisticated meta-paths to specify her clustering preference. Thus, we propose to integrate meta-path selection with user-guided clustering to cluster objects in networks, where a user first provides a small set of object seeds for each cluster as guidance. Then the system learns the weights for each meta-path that are consistent with the clustering result implied by the guidance, and generates clusters under the learned weights of meta-paths. A probabilistic approach is proposed to solve the problem, and an effective and efficient iterative algorithm, PathSelClus, is proposed to learn the model, where the clustering quality and the meta-path weights are mutually enhancing each other. Our experiments with several clustering tasks in two real networks demonstrate the power of the algorithm in comparison with the baselines.

Original language: English (US)
Title of host publication: KDD'12 - 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
Pages: 1348-1356
Number of pages: 9
State: Published - Sep 14 2012
Event: 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2012 - Beijing, China
Duration: Aug 12 2012 to Aug 16 2012

Publication series

Name: Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining

Other

Other: 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2012
Country/Territory: China
City: Beijing
Period: 8/12/12 to 8/16/12

Keywords

  • heterogeneous information networks
  • meta-path selection
  • user guided clustering

ASJC Scopus subject areas

  • Software
  • Information Systems
