On clustering heterogeneous social media objects with outlier links

Guo Jun Qi, Charu C. Aggarwal, Thomas S. Huang

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

The clustering of social media objects provides an intrinsic understanding of the similarity relationships between documents, images, and their contextual sources. Both content and link structure provide important cues for effectively clustering the underlying objects. While link information provides useful hints for improving the clustering process, it also contains a significant amount of noise. A robust clustering algorithm is therefore required to reduce the impact of noisy links. To address these problems, we propose heterogeneous random fields to model the structure and content of social media networks. We design a probability measure on the social media network that outputs a configuration of clusters consistent with both content and link structure. Furthermore, noisy links can be detected, and their impact on the clustering algorithm can be significantly reduced. We conduct experiments on a real social media network and show the advantage of the method over other state-of-the-art algorithms.

Original language: English (US)
Title of host publication: WSDM 2012 - Proceedings of the 5th ACM International Conference on Web Search and Data Mining
Pages: 553-562
Number of pages: 10
DOIs
State: Published - 2012
Externally published: Yes
Event: 5th ACM International Conference on Web Search and Data Mining, WSDM 2012 - Seattle, WA, United States
Duration: Feb 8 2012 to Feb 12 2012

Publication series

Name: WSDM 2012 - Proceedings of the 5th ACM International Conference on Web Search and Data Mining

Other

Other: 5th ACM International Conference on Web Search and Data Mining, WSDM 2012
Country/Territory: United States
City: Seattle, WA
Period: 2/8/12 to 2/12/12

Keywords

  • Noisy links
  • Robust clustering
  • Social media networks

ASJC Scopus subject areas

  • Computer Networks and Communications
