TY - GEN
T1 - GIN
T2 - SIAM International Conference on Data Mining 2015, SDM 2015
AU - Liu, Jialu
AU - Wang, Chi
AU - Gao, Jing
AU - Gu, Quanquan
AU - Aggarwal, Charu
AU - Kaplan, Lance
AU - Han, Jiawei
N1 - Publisher Copyright: Copyright © SIAM.
PY - 2015
Y1 - 2015
N2 - Networked data often consists of interconnected multi-typed nodes and links. A common assumption behind such heterogeneity is the shared clustering structure. However, existing network clustering approaches oversimplify the heterogeneity by treating either nodes or links in a homogeneous fashion, resulting in massive loss of information. In addition, these studies are more or less restricted to specific network schemas or applications, losing generality. In this paper, we introduce a flexible model to explain the process of forming heterogeneous links based on shared clustering information of heterogeneous nodes. Specifically, we categorize the link generation process into binary and weighted cases and model them respectively. We show these two cases can be seamlessly integrated into a unified model. We propose to maximize a joint log-likelihood function to infer the model efficiently with Expectation Maximization (EM) algorithms. Experiments on real-world networked data sets demonstrate the effectiveness and flexibility of the proposed method in fully capturing the dual heterogeneity of both nodes and links.
AB - Networked data often consists of interconnected multi-typed nodes and links. A common assumption behind such heterogeneity is the shared clustering structure. However, existing network clustering approaches oversimplify the heterogeneity by treating either nodes or links in a homogeneous fashion, resulting in massive loss of information. In addition, these studies are more or less restricted to specific network schemas or applications, losing generality. In this paper, we introduce a flexible model to explain the process of forming heterogeneous links based on shared clustering information of heterogeneous nodes. Specifically, we categorize the link generation process into binary and weighted cases and model them respectively. We show these two cases can be seamlessly integrated into a unified model. We propose to maximize a joint log-likelihood function to infer the model efficiently with Expectation Maximization (EM) algorithms. Experiments on real-world networked data sets demonstrate the effectiveness and flexibility of the proposed method in fully capturing the dual heterogeneity of both nodes and links.
UR - http://www.scopus.com/inward/record.url?scp=84961876915&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84961876915&partnerID=8YFLogxK
U2 - 10.1137/1.9781611974010.44
DO - 10.1137/1.9781611974010.44
M3 - Conference contribution
AN - SCOPUS:84961876915
T3 - SIAM International Conference on Data Mining 2015, SDM 2015
SP - 388
EP - 396
BT - SIAM International Conference on Data Mining 2015, SDM 2015
A2 - Venkatasubramanian, Suresh
A2 - Ye, Jieping
PB - Society for Industrial and Applied Mathematics Publications
Y2 - 30 April 2015 through 2 May 2015
ER -