TY - GEN
T1 - On clustering heterogeneous social media objects with outlier links
AU - Qi, Guo Jun
AU - Aggarwal, Charu C.
AU - Huang, Thomas S.
PY - 2012
Y1 - 2012
N2 - The clustering of social media objects provides intrinsic understanding of the similarity relationships between documents, images, and their contextual sources. Both content and link structure provide important cues for an effective clustering algorithm of the underlying objects. While link information provides useful hints for improving the clustering process, it also contains a significant amount of noisy information. Therefore, a robust clustering algorithm is required to reduce the impact of noisy links. In order to address the aforementioned problems, we propose heterogeneous random fields to model the structure and content of social media networks. We design a probability measure on the social media networks which outputs a configuration of clusters that are consistent with both content and link structure. Furthermore, noisy links can also be detected, and their impact on the clustering algorithm can be significantly reduced. We conduct experiments on a real social media network and show the advantage of the method over other state-of-the-art algorithms.
KW - Noisy links
KW - Robust clustering
KW - Social media networks
UR - http://www.scopus.com/inward/record.url?scp=84858063639&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84858063639&partnerID=8YFLogxK
U2 - 10.1145/2124295.2124363
DO - 10.1145/2124295.2124363
M3 - Conference contribution
AN - SCOPUS:84858063639
SN - 9781450307475
T3 - WSDM 2012 - Proceedings of the 5th ACM International Conference on Web Search and Data Mining
SP - 553
EP - 562
BT - WSDM 2012 - Proceedings of the 5th ACM International Conference on Web Search and Data Mining
T2 - 5th ACM International Conference on Web Search and Data Mining, WSDM 2012
Y2 - 8 February 2012 through 12 February 2012
ER -