Mining Query-Based Subnetwork Outliers in Heterogeneous Information Networks

Honglei Zhuang, Jing Zhang, George Brova, Jie Tang, Hasan Cam, Xifeng Yan, Jiawei Han

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Mining outliers in a heterogeneous information network is a challenging problem: it is unclear even what should count as an outlier in a large heterogeneous network (e.g., outliers in an entire bibliographic network consisting of authors, titles, papers, and venues). In this study, we propose an interesting class of outliers, query-based subnetwork outliers: given a heterogeneous network, a user issues a query to retrieve a set of task-relevant subnetworks, among which subnetwork outliers are those that deviate significantly from the others (e.g., outliers among author groups studying 'topic modeling'). We formalize this problem and propose a general framework in which one can query for subnetwork outliers with respect to different semantics. We introduce the notion of subnetwork similarity, which captures the proximity between two subnetworks by their membership distributions. We propose an outlier detection algorithm that ranks all the subnetworks according to their outlierness without tuning parameters. Our quantitative and qualitative experiments on both synthetic and real data sets show that the proposed method outperforms the baselines.

Original language: English (US)
Title of host publication: Proceedings - 14th IEEE International Conference on Data Mining, ICDM 2014
Editors: Ravi Kumar, Hannu Toivonen, Jian Pei, Joshua Zhexue Huang, Xindong Wu
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 1127-1132
Number of pages: 6
Edition: January
ISBN (Electronic): 9781479943029
State: Published - Jan 1 2014
Event: 14th IEEE International Conference on Data Mining, ICDM 2014 - Shenzhen, China
Duration: Dec 14 2014 → Dec 17 2014

Publication series

Name: Proceedings - IEEE International Conference on Data Mining, ICDM
Number: January
Volume: 2015-January
ISSN (Print): 1550-4786

Other

Other: 14th IEEE International Conference on Data Mining, ICDM 2014
Country/Territory: China
City: Shenzhen
Period: 12/14/14 → 12/17/14

Keywords

  • heterogeneous information network
  • outlier detection
  • query-based

ASJC Scopus subject areas

  • General Engineering
