Recursive fact-finding: A streaming approach to truth estimation in crowdsourcing applications

Dong Wang, Tarek Abdelzaher, Lance Kaplan, Charu C. Aggarwal

Research output: Chapter in Book/Report/Conference proceedingConference contribution

Abstract

This paper presents a streaming approach to solve the truth estimation problem in crowdsourcing applications. We consider a category of crowdsourcing applications where a group of individuals volunteer (or are recruited to) share certain observations or measurements about the physical world. Examples include reporting locations of gas stations that remain operational after a natural disaster or reporting locations of potholes on city streets. We call such applications social sensing. Ascertaining the correctness of reported observations is a key challenge in such applications, referred to as the truth estimation problem. This problem is made difficult by the fact that the reliability of individual sources is usually unknown a priori, since any concerned citizen may, in principle, participate. Moreover, the timescales of crowdsourcing campaigns of interest can be as small as a few hours or days, which does not offer enough history for a reputation system to converge. Instead, recent prior work, including our own, developed fact-finding algorithms to solve this problem by iteratively assessing the credibility of sources and their claims in the absence of reputation scores. Such algorithms, however, operate on the entire dataset of reported observations in a batch fashion, which makes them less suited to applications where new observations arrive continually. In this paper, we describe a streaming fact-finder that recursively updates previous estimates based on new data. The recursive algorithm solves an expectation maximization (EM) problem to determine the odds of correctness of different observations. We compare the performance of our recursive EM algorithm to a batch EM algorithm, as well as to several state-of-art fact-finders through extensive simulations. We also demonstrate convergence of the recursive algorithm to the results of the batch version through a real social sensing experiment. Our evaluation shows that the proposed approach can process data streams much more efficiently while keeping the truth estimation accuracy close to that of the (much slower) batch algorithm. Ours is therefore the first fact-finder developed with explicit consideration to the continuous update needs of crowd-sourcing applications.

Original languageEnglish (US)
Title of host publicationProceedings - 2013 IEEE 33rd International Conference on Distributed Computing Systems, ICDCS 2013
Pages530-539
Number of pages10
DOIs
StatePublished - 2013
Event2013 IEEE 33rd International Conference on Distributed Computing Systems, ICDCS 2013 - Philadelphia, PA, United States
Duration: Jul 8 2013Jul 11 2013

Publication series

NameProceedings - International Conference on Distributed Computing Systems

Other

Other2013 IEEE 33rd International Conference on Distributed Computing Systems, ICDCS 2013
Country/TerritoryUnited States
CityPhiladelphia, PA
Period7/8/137/11/13

Keywords

  • real-time
  • recursive expectation maximization
  • social sensing
  • streaming data
  • truth discovery

ASJC Scopus subject areas

  • Software
  • Hardware and Architecture
  • Computer Networks and Communications

Fingerprint

Dive into the research topics of 'Recursive fact-finding: A streaming approach to truth estimation in crowdsourcing applications'. Together they form a unique fingerprint.

Cite this