Mining heterogeneous information networks: The next frontier

Research output: Chapter in Book/Report/Conference proceedingConference contribution

Abstract

Real world physical and abstract data objects are interconnected, forming gigantic, interconnected networks. By structuring these data objects into multiple types, such networks become semi-structured heterogeneous information networks. Most real world applications that handle big data, including interconnected social media and social networks, scientific, engineering, or medical information systems, online e-commerce systems, and most database systems, can be structured into heterogeneous information networks. For example, in a medical care network, objects of multiple types, such as patients, doctors, diseases, medication, and links such as visits, diagnosis, and treatments are intertwined together, providing rich information and forming heterogeneous information networks. Effective analysis of large-scale heterogeneous information networks poses an interesting but critical challenge. In this talk, we present a set of data mining scenarios in heterogeneous information networks and show that mining heterogeneous information networks is a new and promising research frontier in data mining research. Departing from many existing network models that view data as homogeneous graphs or networks, the semi-structured heterogeneous information network model leverages the rich semantics of typed nodes and links in a network and can uncover surprisingly rich knowledge from interconnected data. This heterogeneous network modeling will lead to the discovery of a set of new principles and methodologies for mining interconnected data. The examples to be used in this discussion include (1) meta path-based similarity search, (2) rank-based clustering, (3) rank-based classification, (4) meta path-based link/relationship prediction, (5) relation strength-aware mining, as well as a few other recent developments. We will also point out some promising research directions and provide convincing arguments on that mining heterogeneous information networks is the next frontier in data mining.

Original languageEnglish (US)
Title of host publicationKDD'12 - 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
Pages2-3
Number of pages2
DOIs
StatePublished - Sep 14 2012
Event18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2012 - Beijing, China
Duration: Aug 12 2012Aug 16 2012

Publication series

NameProceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining

Other

Other18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2012
CountryChina
CityBeijing
Period8/12/128/16/12

Keywords

  • data mining
  • heterogeneous information networks

ASJC Scopus subject areas

  • Software
  • Information Systems

Fingerprint Dive into the research topics of 'Mining heterogeneous information networks: The next frontier'. Together they form a unique fingerprint.

Cite this