A semi-supervised active-learning truth estimator for social networks

Hang Cui, Tarek Abdelzaher, Lance Kaplan

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

This paper introduces an active-learning-based truth estimator for social networks, such as Twitter, that enhances estimation accuracy significantly by requesting a well-selected (small) fraction of data to be labeled. Data assessment and truth discovery from arbitrary open online sources pose a hard problem due to uncertainty regarding source reliability. Multiple truth-finding systems were developed to solve this problem. Their accuracy is limited by the noisy nature of the data, where distortions, fabrications, omissions, and duplication are introduced. This paper presents a semi-supervised truth estimator for social networks, in which a portion of inputs are carefully selected to be reliably verified. The challenge is to find the subset of observations to verify that would maximally enhance the overall fact-finding accuracy. This work extends previous passive approaches to recursive truth estimation, as well as semi-supervised approaches where the estimator has no control over the choice of data to be labeled. Results show that by optimally selecting claims to be verified, we improve estimated accuracy by 12% over an unsupervised baseline, and by 5% over previous semi-supervised approaches.
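The selection problem described in the abstract can be illustrated with a minimal sketch. This is not the estimator from the paper (whose keywords indicate a maximum-likelihood formulation); it swaps in a simple averaging heuristic for truth estimation and plain uncertainty sampling for choosing which claim to verify. All names (`estimate_truth`, `select_queries`, `active_estimate`, the `oracle` callback) and the data layout are assumptions made for illustration only.

```python
from typing import Callable, Dict, List, Set, Tuple


def estimate_truth(
    assertions: Dict[str, Set[str]],
    labels: Dict[str, bool],
    iterations: int = 20,
) -> Tuple[Dict[str, float], Dict[str, float]]:
    """Averaging heuristic (an assumption, not the paper's estimator).

    assertions maps each source to the non-empty set of claims it asserts.
    labels holds verified claims; their beliefs are clamped to 0.0 or 1.0.
    """
    claims = sorted(set().union(*assertions.values()))
    n_sources = len(assertions)
    # Start from simple voting: the fraction of sources asserting each claim.
    belief = {
        c: sum(c in cs for cs in assertions.values()) / n_sources for c in claims
    }
    for c, value in labels.items():
        belief[c] = 1.0 if value else 0.0
    reliability: Dict[str, float] = {}
    for _ in range(iterations):
        # A source is as reliable as the average belief in what it asserts.
        for s, cs in assertions.items():
            reliability[s] = sum(belief[c] for c in cs) / len(cs)
        # An unlabeled claim is as believable as its average supporter.
        for c in claims:
            if c in labels:
                continue
            support = [reliability[s] for s, cs in assertions.items() if c in cs]
            belief[c] = sum(support) / len(support)
    return belief, reliability


def select_queries(belief: Dict[str, float], labeled: Set[str], k: int) -> List[str]:
    """Uncertainty sampling: pick the k unlabeled claims closest to belief 0.5."""
    candidates = [c for c in belief if c not in labeled]
    return sorted(candidates, key=lambda c: (abs(belief[c] - 0.5), c))[:k]


def active_estimate(
    assertions: Dict[str, Set[str]],
    oracle: Callable[[str], bool],
    budget: int,
) -> Tuple[Dict[str, float], Dict[str, float]]:
    """Alternate between estimating and verifying one claim, up to `budget` times."""
    labels: Dict[str, bool] = {}
    for _ in range(budget):
        belief, _rel = estimate_truth(assertions, labels)
        picked = select_queries(belief, set(labels), 1)
        if not picked:
            break
        labels[picked[0]] = oracle(picked[0])  # hypothetical verification step
    return estimate_truth(assertions, labels)
```

In this sketch the verification budget is spent one claim at a time, so each new label can change which claim looks most uncertain in the next round. A passive semi-supervised variant would instead receive `labels` from outside and call `estimate_truth` once.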

Original language: English (US)
Title of host publication: The Web Conference 2019 - Proceedings of the World Wide Web Conference, WWW 2019
Publisher: Association for Computing Machinery, Inc
Pages: 296-306
Number of pages: 11
ISBN (Electronic): 9781450366748
DOIs: https://doi.org/10.1145/3308558.3313712
State: Published - May 13 2019
Event: 2019 World Wide Web Conference, WWW 2019 - San Francisco, United States
Duration: May 13 2019 to May 17 2019

Publication series

Name: The Web Conference 2019 - Proceedings of the World Wide Web Conference, WWW 2019

Conference

Conference: 2019 World Wide Web Conference, WWW 2019
Country: United States
City: San Francisco
Period: 5/13/19 to 5/17/19

Fingerprint

Fabrication
Problem-Based Learning
Uncertainty

Keywords

  • Active Learning
  • Maximum Likelihood Estimation
  • Semi Supervision
  • Social Sensing
  • Truth Discovery

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Software

Cite this

Cui, H., Abdelzaher, T., & Kaplan, L. (2019). A semi-supervised active-learning truth estimator for social networks. In The Web Conference 2019 - Proceedings of the World Wide Web Conference, WWW 2019 (pp. 296-306). (The Web Conference 2019 - Proceedings of the World Wide Web Conference, WWW 2019). Association for Computing Machinery, Inc. https://doi.org/10.1145/3308558.3313712

@inproceedings{5ad16b5a9853437888447df56252049f,
title = "A semi-supervised active-learning truth estimator for social networks",
abstract = "This paper introduces an active-learning-based truth estimator for social networks, such as Twitter, that enhances estimation accuracy significantly by requesting a well-selected (small) fraction of data to be labeled. Data assessment and truth discovery from arbitrary open online sources are a hard problem due to uncertainty regarding source reliability. Multiple truth finding systems were developed to solve this problem. Their accuracy is limited by the noisy nature of the data, where distortions, fabrications, omissions, and duplication are introduced. This paper presents a semi-supervised truth estimator for social networks, in which a portion of inputs are carefully selected to be reliably verified. The challenge is to find the subset of observations to verify that would maximally enhance the overall fact-finding accuracy. This work extends previous passive approaches to recursive truth estimation, as well as semi-supervised approaches where the estimator has no control over the choice of data to be labeled. Results show that by optimally selecting claims to be verified, we improve estimated accuracy by 12{\%} over unsupervised baseline, and by 5{\%} over previous semi-supervised approaches.",
keywords = "Active Learning, Maximum Likelihood Estimation, Semi Supervision, Social Sensing, Truth Discovery",
author = "Hang Cui and Tarek Abdelzaher and Lance Kaplan",
year = "2019",
month = "5",
day = "13",
doi = "10.1145/3308558.3313712",
language = "English (US)",
series = "The Web Conference 2019 - Proceedings of the World Wide Web Conference, WWW 2019",
publisher = "Association for Computing Machinery, Inc",
pages = "296--306",
booktitle = "The Web Conference 2019 - Proceedings of the World Wide Web Conference, WWW 2019",

}

TY - GEN

T1 - A semi-supervised active-learning truth estimator for social networks

AU - Cui, Hang

AU - Abdelzaher, Tarek

AU - Kaplan, Lance

PY - 2019/5/13

Y1 - 2019/5/13

AB - This paper introduces an active-learning-based truth estimator for social networks, such as Twitter, that enhances estimation accuracy significantly by requesting a well-selected (small) fraction of data to be labeled. Data assessment and truth discovery from arbitrary open online sources are a hard problem due to uncertainty regarding source reliability. Multiple truth finding systems were developed to solve this problem. Their accuracy is limited by the noisy nature of the data, where distortions, fabrications, omissions, and duplication are introduced. This paper presents a semi-supervised truth estimator for social networks, in which a portion of inputs are carefully selected to be reliably verified. The challenge is to find the subset of observations to verify that would maximally enhance the overall fact-finding accuracy. This work extends previous passive approaches to recursive truth estimation, as well as semi-supervised approaches where the estimator has no control over the choice of data to be labeled. Results show that by optimally selecting claims to be verified, we improve estimated accuracy by 12% over unsupervised baseline, and by 5% over previous semi-supervised approaches.

KW - Active Learning

KW - Maximum Likelihood Estimation

KW - Semi Supervision

KW - Social Sensing

KW - Truth Discovery

UR - http://www.scopus.com/inward/record.url?scp=85066905643&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85066905643&partnerID=8YFLogxK

U2 - 10.1145/3308558.3313712

DO - 10.1145/3308558.3313712

M3 - Conference contribution

T3 - The Web Conference 2019 - Proceedings of the World Wide Web Conference, WWW 2019

SP - 296

EP - 306

BT - The Web Conference 2019 - Proceedings of the World Wide Web Conference, WWW 2019

PB - Association for Computing Machinery, Inc

ER -