Splog detection using content, time and link structures

Yu Ru Lin, Hari Sundaram, Yun Chi, Jun Tatemura, Belle Tseng

Research output: Chapter in Book/Report/Conference proceedingConference contribution

Abstract

This paper focuses on spam Hog (splog) detection. Blogs are highly popular, new media social communication mechanisms and splogs corrupt blog search results as well as waste network resources. In our approach we exploit unique blog temporal dynamics to detect splogs. The key idea is that splogs exhibit high temporal regularity in content and post time, as well as consistent linking patterns. Temporal content regularity is detected using a novel autocorrelation of post content. Temporal structural regularity is determined using the entropy of the post time difference distribution, while the link regularity is computed using a HITS based hub score measure. Experiments based on the annotated ground truth on real world dataset show excellent results on splog detection tasks with 90% accuracy.

Original languageEnglish (US)
Title of host publicationProceedings of the 2007 IEEE International Conference on Multimedia and Expo, ICME 2007
PublisherIEEE Computer Society
Pages2030-2033
Number of pages4
ISBN (Print)1424410177, 9781424410170
DOIs
StatePublished - 2007
Externally publishedYes
EventIEEE International Conference onMultimedia and Expo, ICME 2007 - Beijing, China
Duration: Jul 2 2007Jul 5 2007

Publication series

NameProceedings of the 2007 IEEE International Conference on Multimedia and Expo, ICME 2007

Other

OtherIEEE International Conference onMultimedia and Expo, ICME 2007
Country/TerritoryChina
CityBeijing
Period7/2/077/5/07

ASJC Scopus subject areas

  • Computer Graphics and Computer-Aided Design
  • Software

Fingerprint

Dive into the research topics of 'Splog detection using content, time and link structures'. Together they form a unique fingerprint.

Cite this