This paper defines and studies the problem of universal clustering using responses of crowd workers, without knowledge of worker reliability or task difficulty. We model stochastic worker response distributions by incorporating traits of memory for similar objects and traits of distance among differing objects. We are particularly interested in two limiting worker types - temporary and long-term workers, without and with memory respectively. We first define clustering algorithms for these limiting cases and then integrate them into an algorithm for the unified worker model. We prove asymptotic consistency of the algorithms and establish sufficient conditions on the sample complexity of the algorithm. Converse arguments establish necessary conditions on sample complexity, proving that the defined algorithms are asymptotically order-optimal in cost.

Original languageEnglish (US)
Title of host publication2017 IEEE International Symposium on Information Theory, ISIT 2017
PublisherInstitute of Electrical and Electronics Engineers Inc.
Number of pages5
ISBN (Electronic)9781509040964
StatePublished - Aug 9 2017
Event2017 IEEE International Symposium on Information Theory, ISIT 2017 - Aachen, Germany
Duration: Jun 25 2017Jun 30 2017

Publication series

NameIEEE International Symposium on Information Theory - Proceedings
ISSN (Print)2157-8095


Other2017 IEEE International Symposium on Information Theory, ISIT 2017

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Information Systems
  • Modeling and Simulation
  • Applied Mathematics


Dive into the research topics of 'Budget-optimal clustering via crowdsourcing'. Together they form a unique fingerprint.

Cite this