Enhancing reliability using peer consistency evaluation in human computation

Shih Wen Huang, Wai Tat Fu

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution


Peer consistency evaluation is often used in games with a purpose (GWAP) to evaluate workers against the outputs of other workers, without relying on gold standard answers. Despite its popularity, the reliability of peer consistency evaluation has never been systematically tested to show how it can be used as a general evaluation method in human computation systems. We present experimental results showing that human computation systems using peer consistency evaluation can produce outcomes even better than those of systems that evaluate workers against gold standard answers. We also show that, even without any evaluation, simply telling workers that their answers will be used as future evaluation standards can significantly enhance their performance. These results have important implications for methods that improve the reliability of human computation systems.

Original language: English (US)
Title of host publication: CSCW 2013 - Proceedings of the 2013 ACM Conference on Computer Supported Cooperative Work
Number of pages: 9
State: Published - Mar 18 2013
Event: 2013 2nd ACM Conference on Computer Supported Cooperative Work, CSCW 2013 - San Antonio, TX, United States
Duration: Feb 23 2013 - Feb 27 2013

Publication series

Name: Proceedings of the ACM Conference on Computer Supported Cooperative Work, CSCW


Other: 2013 2nd ACM Conference on Computer Supported Cooperative Work, CSCW 2013
Country/Territory: United States
City: San Antonio, TX


Keywords

  • Crowdsourcing
  • Evaluation
  • Human computation
  • Mechanical Turk
  • User behavior

ASJC Scopus subject areas

  • Software
  • Human-Computer Interaction
  • Computer Networks and Communications