TY - JOUR
T1 - From community detection to community profiling
AU - Cai, Hongyun
AU - Zheng, Vincent W.
AU - Zhu, Fanwei
AU - Chang, Kevin Chen Chuan
AU - Huang, Zi
N1 - Publisher Copyright: © 2017 VLDB Endowment. Copyright 2018 Elsevier B.V., All rights reserved.
PY - 2017
Y1 - 2017
N2 - Most existing community-related studies focus on detection, which aims to find the community membership of each user from user friendship links. However, membership alone, without a complete profile of what a community is and how it interacts with other communities, has limited applications. This motivates us to consider systematically profiling the communities and thereby developing useful community-level applications. In this paper, we formalize the concept of community profiling for the first time. With rich user information on the network, such as user-published content and user diffusion links, we characterize a community in terms of both its internal content profile and its external diffusion profile. The difficulty of community profiling is often underestimated. We identify three unique challenges and propose a joint Community Profiling and Detection (CPD) model to address them. We also contribute a scalable inference algorithm that scales linearly with the data size and is easily parallelizable. We evaluate CPD on large-scale real-world data sets and show that it is significantly better than state-of-the-art baselines in various tasks.
AB - Most existing community-related studies focus on detection, which aims to find the community membership of each user from user friendship links. However, membership alone, without a complete profile of what a community is and how it interacts with other communities, has limited applications. This motivates us to consider systematically profiling the communities and thereby developing useful community-level applications. In this paper, we formalize the concept of community profiling for the first time. With rich user information on the network, such as user-published content and user diffusion links, we characterize a community in terms of both its internal content profile and its external diffusion profile. The difficulty of community profiling is often underestimated. We identify three unique challenges and propose a joint Community Profiling and Detection (CPD) model to address them. We also contribute a scalable inference algorithm that scales linearly with the data size and is easily parallelizable. We evaluate CPD on large-scale real-world data sets and show that it is significantly better than state-of-the-art baselines in various tasks.
UR - http://www.scopus.com/inward/record.url?scp=85021220508&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85021220508&partnerID=8YFLogxK
U2 - 10.14778/3067421.3067430
DO - 10.14778/3067421.3067430
M3 - Conference article
AN - SCOPUS:85021220508
VL - 10
SP - 817
EP - 828
JO - Proceedings of the VLDB Endowment
JF - Proceedings of the VLDB Endowment
SN - 2150-8097
IS - 7
T2 - 43rd International Conference on Very Large Data Bases, VLDB 2017
Y2 - 28 August 2017 through 1 September 2017
ER -