Abstract
Most existing community-related studies focus on detection, which aim to find the community membership for each user from user friendship links. However, membership alone, without a complete profile of what a community is and how it interacts with other communities, has limited applications. This motivates us to consider systematically profiling the communities and thereby developing useful community-level applications. In this paper, we for the first time formalize the concept of community profiling. With rich user information on the network, such as user published content and user diffusion links, we characterize a community in terms of both its internal content profile and external diffusion profile. The difficulty of community profiling is often underestimated. We novelly identify three unique challenges and propose a joint Community Profiling and Detection (CPD) model to address them accordingly. We also contribute a scalable inference algorithm, which scales linearly with the data size and it is easily parallelizable. We evaluate CPD on large-scale real-world data sets, and show that it is significantly better than the state-of-the-art baselines in various tasks.
Original language | English (US) |
---|---|
Pages (from-to) | 817-828 |
Number of pages | 12 |
Journal | Proceedings of the VLDB Endowment |
Volume | 10 |
Issue number | 7 |
DOIs | |
State | Published - 2017 |
Event | 43rd International Conference on Very Large Data Bases, VLDB 2017 - Munich, Germany Duration: Aug 28 2017 → Sep 1 2017 |
ASJC Scopus subject areas
- Computer Science (miscellaneous)
- General Computer Science