Discovering maximal motif cliques in large heterogeneous information networks

Jiafeng Hu, Reynold Cheng, Kevin Chen Chuan Chang, Aravind Sankar, Yixiang Fang, Brian Y.H. Lam

Research output: Chapter in Book/Report/Conference proceedingConference contribution

Abstract

We study the discovery of cliques (or 'complete' subgraphs) in heterogeneous information networks (HINs). Existing clique-finding solutions often ignore the rich semantics of HINs. We propose motif clique, or m-clique, which redefines subgraph completeness with respect to a given motif. A motif, essentially a small subgraph pattern, is a fundamental building block of an HIN. The m-clique concept is general and allows us to analyse 'complete' subgraphs in an HIN with respect to desired high-order connection patterns. We further investigate the maximal m-clique enumeration problem (MMCE), which finds all maximal m-cliques not contained in any other m-cliques. Because MMCE is NP-hard, developing an accurate and efficient solution for MMCE is not straightforward. We thus present the META algorithm, which employs advanced pruning strategies to effectively reduce the search space. We also design fast techniques to avoid generating duplicated maximal m-clique instances. Our extensive experiments on large real and synthetic HINs show that META is highly effective and efficient.

Original languageEnglish (US)
Title of host publicationProceedings - 2019 IEEE 35th International Conference on Data Engineering, ICDE 2019
PublisherIEEE Computer Society
Pages746-757
Number of pages12
ISBN (Electronic)9781538674741
DOIs
StatePublished - Apr 2019
Event35th IEEE International Conference on Data Engineering, ICDE 2019 - Macau, China
Duration: Apr 8 2019Apr 11 2019

Publication series

NameProceedings - International Conference on Data Engineering
Volume2019-April
ISSN (Print)1084-4627

Conference

Conference35th IEEE International Conference on Data Engineering, ICDE 2019
CountryChina
CityMacau
Period4/8/194/11/19

Fingerprint

Computational complexity
Semantics
Experiments

Keywords

  • Clique
  • Heterogeneous information networks
  • Motif

ASJC Scopus subject areas

  • Software
  • Signal Processing
  • Information Systems

Cite this

Hu, J., Cheng, R., Chang, K. C. C., Sankar, A., Fang, Y., & Lam, B. Y. H. (2019). Discovering maximal motif cliques in large heterogeneous information networks. In Proceedings - 2019 IEEE 35th International Conference on Data Engineering, ICDE 2019 (pp. 746-757). [8731437] (Proceedings - International Conference on Data Engineering; Vol. 2019-April). IEEE Computer Society. https://doi.org/10.1109/ICDE.2019.00072

Discovering maximal motif cliques in large heterogeneous information networks. / Hu, Jiafeng; Cheng, Reynold; Chang, Kevin Chen Chuan; Sankar, Aravind; Fang, Yixiang; Lam, Brian Y.H.

Proceedings - 2019 IEEE 35th International Conference on Data Engineering, ICDE 2019. IEEE Computer Society, 2019. p. 746-757 8731437 (Proceedings - International Conference on Data Engineering; Vol. 2019-April).

Research output: Chapter in Book/Report/Conference proceedingConference contribution

Hu, J, Cheng, R, Chang, KCC, Sankar, A, Fang, Y & Lam, BYH 2019, Discovering maximal motif cliques in large heterogeneous information networks. in Proceedings - 2019 IEEE 35th International Conference on Data Engineering, ICDE 2019., 8731437, Proceedings - International Conference on Data Engineering, vol. 2019-April, IEEE Computer Society, pp. 746-757, 35th IEEE International Conference on Data Engineering, ICDE 2019, Macau, China, 4/8/19. https://doi.org/10.1109/ICDE.2019.00072
Hu J, Cheng R, Chang KCC, Sankar A, Fang Y, Lam BYH. Discovering maximal motif cliques in large heterogeneous information networks. In Proceedings - 2019 IEEE 35th International Conference on Data Engineering, ICDE 2019. IEEE Computer Society. 2019. p. 746-757. 8731437. (Proceedings - International Conference on Data Engineering). https://doi.org/10.1109/ICDE.2019.00072
Hu, Jiafeng ; Cheng, Reynold ; Chang, Kevin Chen Chuan ; Sankar, Aravind ; Fang, Yixiang ; Lam, Brian Y.H. / Discovering maximal motif cliques in large heterogeneous information networks. Proceedings - 2019 IEEE 35th International Conference on Data Engineering, ICDE 2019. IEEE Computer Society, 2019. pp. 746-757 (Proceedings - International Conference on Data Engineering).
@inproceedings{68cd2053e55241c9875706584c21c3e8,
title = "Discovering maximal motif cliques in large heterogeneous information networks",
abstract = "We study the discovery of cliques (or 'complete' subgraphs) in heterogeneous information networks (HINs). Existing clique-finding solutions often ignore the rich semantics of HINs. We propose motif clique, or m-clique, which redefines subgraph completeness with respect to a given motif. A motif, essentially a small subgraph pattern, is a fundamental building block of an HIN. The m-clique concept is general and allows us to analyse 'complete' subgraphs in an HIN with respect to desired high-order connection patterns. We further investigate the maximal m-clique enumeration problem (MMCE), which finds all maximal m-cliques not contained in any other m-cliques. Because MMCE is NP-hard, developing an accurate and efficient solution for MMCE is not straightforward. We thus present the META algorithm, which employs advanced pruning strategies to effectively reduce the search space. We also design fast techniques to avoid generating duplicated maximal m-clique instances. Our extensive experiments on large real and synthetic HINs show that META is highly effective and efficient.",
keywords = "Clique, Heterogeneous information networks, Motif",
author = "Jiafeng Hu and Reynold Cheng and Chang, {Kevin Chen Chuan} and Aravind Sankar and Yixiang Fang and Lam, {Brian Y.H.}",
year = "2019",
month = "4",
doi = "10.1109/ICDE.2019.00072",
language = "English (US)",
series = "Proceedings - International Conference on Data Engineering",
publisher = "IEEE Computer Society",
pages = "746--757",
booktitle = "Proceedings - 2019 IEEE 35th International Conference on Data Engineering, ICDE 2019",

}

TY - GEN

T1 - Discovering maximal motif cliques in large heterogeneous information networks

AU - Hu, Jiafeng

AU - Cheng, Reynold

AU - Chang, Kevin Chen Chuan

AU - Sankar, Aravind

AU - Fang, Yixiang

AU - Lam, Brian Y.H.

PY - 2019/4

Y1 - 2019/4

N2 - We study the discovery of cliques (or 'complete' subgraphs) in heterogeneous information networks (HINs). Existing clique-finding solutions often ignore the rich semantics of HINs. We propose motif clique, or m-clique, which redefines subgraph completeness with respect to a given motif. A motif, essentially a small subgraph pattern, is a fundamental building block of an HIN. The m-clique concept is general and allows us to analyse 'complete' subgraphs in an HIN with respect to desired high-order connection patterns. We further investigate the maximal m-clique enumeration problem (MMCE), which finds all maximal m-cliques not contained in any other m-cliques. Because MMCE is NP-hard, developing an accurate and efficient solution for MMCE is not straightforward. We thus present the META algorithm, which employs advanced pruning strategies to effectively reduce the search space. We also design fast techniques to avoid generating duplicated maximal m-clique instances. Our extensive experiments on large real and synthetic HINs show that META is highly effective and efficient.

AB - We study the discovery of cliques (or 'complete' subgraphs) in heterogeneous information networks (HINs). Existing clique-finding solutions often ignore the rich semantics of HINs. We propose motif clique, or m-clique, which redefines subgraph completeness with respect to a given motif. A motif, essentially a small subgraph pattern, is a fundamental building block of an HIN. The m-clique concept is general and allows us to analyse 'complete' subgraphs in an HIN with respect to desired high-order connection patterns. We further investigate the maximal m-clique enumeration problem (MMCE), which finds all maximal m-cliques not contained in any other m-cliques. Because MMCE is NP-hard, developing an accurate and efficient solution for MMCE is not straightforward. We thus present the META algorithm, which employs advanced pruning strategies to effectively reduce the search space. We also design fast techniques to avoid generating duplicated maximal m-clique instances. Our extensive experiments on large real and synthetic HINs show that META is highly effective and efficient.

KW - Clique

KW - Heterogeneous information networks

KW - Motif

UR - http://www.scopus.com/inward/record.url?scp=85067987999&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85067987999&partnerID=8YFLogxK

U2 - 10.1109/ICDE.2019.00072

DO - 10.1109/ICDE.2019.00072

M3 - Conference contribution

AN - SCOPUS:85067987999

T3 - Proceedings - International Conference on Data Engineering

SP - 746

EP - 757

BT - Proceedings - 2019 IEEE 35th International Conference on Data Engineering, ICDE 2019

PB - IEEE Computer Society

ER -