Provenance-assisted classification in social networks

Dong Wang, Md Tanvir Al Amin, Tarek Abdelzaher, Dan Roth, Clare R. Voss, Lance M. Kaplan, Stephen Tratz, Jamal Laoudi, Douglas Briesch

Research output: Contribution to journal › Article › peer-review

Abstract

Signal feature extraction and classification are two common tasks in the signal processing literature. This paper investigates the use of source identities as a common mechanism for enhancing the classification accuracy of social signals. We define social signals as outputs, such as microblog entries, geotags, or uploaded images, contributed by users in a social network. Many classification tasks can be defined on such outputs. For example, one may want to identify the dialect of a microblog contributed by an author, or classify information referred to in a user's tweet as true or false. While the design of such classifiers is application-specific, social signals share in common one key property: they are augmented by the explicit identity of the source. This motivates investigating whether or not knowing the source of each signal (in addition to exploiting signal features) allows the classification accuracy to be improved. We call it provenance-assisted classification. This paper answers the above question affirmatively, demonstrating how source identities can improve classification accuracy, and derives confidence bounds to quantify the accuracy of results. Evaluation is performed in two real-world contexts: (i) fact-finding that classifies microblog entries into true and false, and (ii) language classification of tweets issued by a set of possibly multi-lingual speakers. We also carry out extensive simulation experiments to further evaluate the performance of the proposed classification scheme over different problem dimensions. The results show that provenance features significantly improve classification accuracy of social signals, even when no information is known about the sources (besides their ID). This observation offers a general mechanism for enhancing classification results in social networks.

Original language: English (US)
Article number: 6766747
Pages (from-to): 624-637
Number of pages: 14
Journal: IEEE Journal on Selected Topics in Signal Processing
Volume: 8
Issue number: 4
State: Published - Aug 2014

Keywords

  • Social signals
  • classification
  • expectation maximization
  • maximum likelihood estimation
  • signal feature extraction
  • uncertain provenance

ASJC Scopus subject areas

  • Signal Processing
  • Electrical and Electronic Engineering
