Improving Consistency of Crowdsourced Multimedia Similarity for Evaluation

Peter Organisciak, J. Stephen Downie

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Building evaluation datasets for information retrieval is a time-consuming and exhausting activity. To evaluate research over novel corpora, researchers are increasingly turning to crowdsourcing to distribute evaluation dataset creation efficiently among many workers. However, there has been little investigation into the effect of instrument design on data quality in crowdsourced evaluation datasets. We pursue this question through a case study of music similarity judgments in a music digital library evaluation, where we find that, even with trusted graders, song pairs are not rated consistently. We find that much of this low intra-coder consistency can be attributed to the task design and to judge effects, and we conclude with recommendations for achieving reliable evaluation judgments for music similarity and other normative judgment tasks.
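The study's central quantity is intra-coder consistency: whether the same judge, shown the same song pair more than once, assigns the same similarity rating. A minimal sketch of how such a measure can be computed is below; the data layout, field names, and the exact-agreement statistic are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch: estimating intra-coder consistency on repeated
# song-pair similarity judgments. Field names and the agreement
# statistic are illustrative assumptions, not the paper's code.
from collections import defaultdict
from statistics import mean

# Each judgment: (judge_id, song_pair_id, rating on an ordinal scale).
judgments = [
    ("j1", "pairA", 3), ("j1", "pairA", 3),
    ("j1", "pairB", 2), ("j1", "pairB", 4),
    ("j2", "pairA", 1), ("j2", "pairA", 2),
]

def intra_coder_consistency(judgments):
    """Return the fraction of repeated (judge, pair) ratings that agree
    exactly, and the mean spread (max - min) among the repeats."""
    repeats = defaultdict(list)
    for judge, pair, rating in judgments:
        repeats[(judge, pair)].append(rating)

    exact, spreads = [], []
    for ratings in repeats.values():
        if len(ratings) < 2:
            continue  # a pair rated only once says nothing about consistency
        exact.append(len(set(ratings)) == 1)
        spreads.append(max(ratings) - min(ratings))
    return mean(exact), mean(spreads)

agree_rate, mean_spread = intra_coder_consistency(judgments)
print(f"exact agreement: {agree_rate:.2f}, mean spread: {mean_spread:.2f}")
```

On this toy input only one of three repeated pairs is rated identically (agreement 0.33), the kind of low within-judge consistency the abstract attributes to task design and judge effects rather than to untrustworthy graders.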

Original language: English (US)
Title of host publication: JCDL 2015 - Proceedings of the 15th ACM/IEEE-CS Joint Conference on Digital Libraries
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 115-118
Number of pages: 4
ISBN (Electronic): 9781450335942
DOIs
State: Published - Jun 21 2015
Event: 15th ACM/IEEE-CS Joint Conference on Digital Libraries, JCDL 2015 - Knoxville, United States
Duration: Jun 21 2015 - Jun 25 2015

Publication series

Name: Proceedings of the ACM/IEEE Joint Conference on Digital Libraries
Volume: 2015-June
ISSN (Print): 1552-5996

Other

Other: 15th ACM/IEEE-CS Joint Conference on Digital Libraries, JCDL 2015
Country/Territory: United States
City: Knoxville
Period: 6/21/15 - 6/25/15

Keywords

  • crowdsourcing
  • music retrieval
  • similarity judgments

ASJC Scopus subject areas

  • General Engineering
