TY - GEN
T1 - Discovering maximal motif cliques in large heterogeneous information networks
AU - Hu, Jiafeng
AU - Cheng, Reynold
AU - Chang, Kevin Chen Chuan
AU - Sankar, Aravind
AU - Fang, Yixiang
AU - Lam, Brian Y.H.
N1 - Publisher Copyright:
© 2019 IEEE.
PY - 2019/4
Y1 - 2019/4
N2 - We study the discovery of cliques (or 'complete' subgraphs) in heterogeneous information networks (HINs). Existing clique-finding solutions often ignore the rich semantics of HINs. We propose motif clique, or m-clique, which redefines subgraph completeness with respect to a given motif. A motif, essentially a small subgraph pattern, is a fundamental building block of an HIN. The m-clique concept is general and allows us to analyse 'complete' subgraphs in an HIN with respect to desired high-order connection patterns. We further investigate the maximal m-clique enumeration problem (MMCE), which finds all maximal m-cliques not contained in any other m-cliques. Because MMCE is NP-hard, developing an accurate and efficient solution for MMCE is not straightforward. We thus present the META algorithm, which employs advanced pruning strategies to effectively reduce the search space. We also design fast techniques to avoid generating duplicated maximal m-clique instances. Our extensive experiments on large real and synthetic HINs show that META is highly effective and efficient.
AB - We study the discovery of cliques (or 'complete' subgraphs) in heterogeneous information networks (HINs). Existing clique-finding solutions often ignore the rich semantics of HINs. We propose motif clique, or m-clique, which redefines subgraph completeness with respect to a given motif. A motif, essentially a small subgraph pattern, is a fundamental building block of an HIN. The m-clique concept is general and allows us to analyse 'complete' subgraphs in an HIN with respect to desired high-order connection patterns. We further investigate the maximal m-clique enumeration problem (MMCE), which finds all maximal m-cliques not contained in any other m-cliques. Because MMCE is NP-hard, developing an accurate and efficient solution for MMCE is not straightforward. We thus present the META algorithm, which employs advanced pruning strategies to effectively reduce the search space. We also design fast techniques to avoid generating duplicated maximal m-clique instances. Our extensive experiments on large real and synthetic HINs show that META is highly effective and efficient.
KW - Clique
KW - Heterogeneous information networks
KW - Motif
UR - http://www.scopus.com/inward/record.url?scp=85067987999&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85067987999&partnerID=8YFLogxK
U2 - 10.1109/ICDE.2019.00072
DO - 10.1109/ICDE.2019.00072
M3 - Conference contribution
AN - SCOPUS:85067987999
T3 - Proceedings - International Conference on Data Engineering
SP - 746
EP - 757
BT - Proceedings - 2019 IEEE 35th International Conference on Data Engineering, ICDE 2019
PB - IEEE Computer Society
T2 - 35th IEEE International Conference on Data Engineering, ICDE 2019
Y2 - 8 April 2019 through 11 April 2019
ER -