TY - GEN
T1 - PaCK
T2 - 9th SIAM International Conference on Data Mining 2009, SDM 2009
AU - He, Jingrui
AU - Tong, Hanghang
AU - Papadimitriou, Spiros
AU - Eliassi-Rad, Tina
AU - Faloutsos, Christos
AU - Carbonell, Jaime
PY - 2009
Y1 - 2009
N2 - Given an author-paper-conference graph, how can we automatically find groups of authors, papers, and conferences, respectively? Existing work either (1) requires fine-tuning of several parameters, or (2) can only be applied to bipartite graphs (e.g., an author-paper graph or a paper-conference graph). To address this problem, we propose PaCK for clustering such k-partite graphs. By optimizing an information-theoretic criterion, PaCK searches for the best number of clusters for each type of object and generates the corresponding clustering. The unique feature of PaCK over existing methods for clustering k-partite graphs lies in its parameter-free nature. Furthermore, it can be easily generalized to cases where certain connectivity relations are expressed as tensors, e.g., time-evolving data. The proposed algorithm is scalable in the sense that it is linear with respect to the total number of edges in the graphs. We present theoretical analysis as well as experimental evaluations to demonstrate both its effectiveness and efficiency.
AB - Given an author-paper-conference graph, how can we automatically find groups of authors, papers, and conferences, respectively? Existing work either (1) requires fine-tuning of several parameters, or (2) can only be applied to bipartite graphs (e.g., an author-paper graph or a paper-conference graph). To address this problem, we propose PaCK for clustering such k-partite graphs. By optimizing an information-theoretic criterion, PaCK searches for the best number of clusters for each type of object and generates the corresponding clustering. The unique feature of PaCK over existing methods for clustering k-partite graphs lies in its parameter-free nature. Furthermore, it can be easily generalized to cases where certain connectivity relations are expressed as tensors, e.g., time-evolving data. The proposed algorithm is scalable in the sense that it is linear with respect to the total number of edges in the graphs. We present theoretical analysis as well as experimental evaluations to demonstrate both its effectiveness and efficiency.
UR - http://www.scopus.com/inward/record.url?scp=73449135163&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=73449135163&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:73449135163
SN - 9781615671090
T3 - Society for Industrial and Applied Mathematics - 9th SIAM International Conference on Data Mining 2009, Proceedings in Applied Mathematics
SP - 1278
EP - 1287
BT - Society for Industrial and Applied Mathematics - 9th SIAM International Conference on Data Mining 2009, Proceedings in Applied Mathematics 133
Y2 - 30 April 2009 through 2 May 2009
ER -