Budget-optimal clustering via crowdsourcing

Ravi Kiran Raman, Lav R Varshney

Research output: Chapter in Book/Report/Conference proceeding (Conference contribution)

Abstract

This paper defines and studies the problem of universal clustering using responses of crowd workers, without knowledge of worker reliability or task difficulty. We model stochastic worker response distributions by incorporating traits of memory for similar objects and traits of distance among differing objects. We are particularly interested in two limiting worker types - temporary and long-term workers, without and with memory respectively. We first define clustering algorithms for these limiting cases and then integrate them into an algorithm for the unified worker model. We prove asymptotic consistency of the algorithms and establish sufficient conditions on the sample complexity of the algorithm. Converse arguments establish necessary conditions on sample complexity, proving that the defined algorithms are asymptotically order-optimal in cost.

Original language: English (US)
Title of host publication: 2017 IEEE International Symposium on Information Theory, ISIT 2017
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 2163-2167
Number of pages: 5
ISBN (Electronic): 9781509040964
DOI: https://doi.org/10.1109/ISIT.2017.8006912
State: Published - Aug 9 2017
Event: 2017 IEEE International Symposium on Information Theory, ISIT 2017 - Aachen, Germany
Duration: Jun 25 2017 to Jun 30 2017

Publication series

Name: IEEE International Symposium on Information Theory - Proceedings
ISSN (Print): 2157-8095

Other

Other: 2017 IEEE International Symposium on Information Theory, ISIT 2017
Country: Germany
City: Aachen
Period: 6/25/17 to 6/30/17

Fingerprint

Clustering
Limiting
Data storage equipment
Stochastic models
Clustering algorithms
Converse
Clustering Algorithm
Stochastic Model
Integrate
Necessary Conditions
Sufficient Conditions
Costs
Object
Model
Knowledge

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Information Systems
  • Modeling and Simulation
  • Applied Mathematics

Cite this

Raman, R. K., & Varshney, L. R. (2017). Budget-optimal clustering via crowdsourcing. In 2017 IEEE International Symposium on Information Theory, ISIT 2017 (pp. 2163-2167). [8006912] (IEEE International Symposium on Information Theory - Proceedings). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/ISIT.2017.8006912

Budget-optimal clustering via crowdsourcing. / Raman, Ravi Kiran; Varshney, Lav R.

2017 IEEE International Symposium on Information Theory, ISIT 2017. Institute of Electrical and Electronics Engineers Inc., 2017. p. 2163-2167 8006912 (IEEE International Symposium on Information Theory - Proceedings).

Raman, RK & Varshney, LR 2017, Budget-optimal clustering via crowdsourcing. in 2017 IEEE International Symposium on Information Theory, ISIT 2017., 8006912, IEEE International Symposium on Information Theory - Proceedings, Institute of Electrical and Electronics Engineers Inc., pp. 2163-2167, 2017 IEEE International Symposium on Information Theory, ISIT 2017, Aachen, Germany, 6/25/17. https://doi.org/10.1109/ISIT.2017.8006912
Raman RK, Varshney LR. Budget-optimal clustering via crowdsourcing. In 2017 IEEE International Symposium on Information Theory, ISIT 2017. Institute of Electrical and Electronics Engineers Inc. 2017. p. 2163-2167. 8006912. (IEEE International Symposium on Information Theory - Proceedings). https://doi.org/10.1109/ISIT.2017.8006912
Raman, Ravi Kiran ; Varshney, Lav R. / Budget-optimal clustering via crowdsourcing. 2017 IEEE International Symposium on Information Theory, ISIT 2017. Institute of Electrical and Electronics Engineers Inc., 2017. pp. 2163-2167 (IEEE International Symposium on Information Theory - Proceedings).
@inproceedings{8ce8bf949dfb4befa2d9b88f2f6437bf,
title = "Budget-optimal clustering via crowdsourcing",
abstract = "This paper defines and studies the problem of universal clustering using responses of crowd workers, without knowledge of worker reliability or task difficulty. We model stochastic worker response distributions by incorporating traits of memory for similar objects and traits of distance among differing objects. We are particularly interested in two limiting worker types - temporary and long-term workers, without and with memory respectively. We first define clustering algorithms for these limiting cases and then integrate them into an algorithm for the unified worker model. We prove asymptotic consistency of the algorithms and establish sufficient conditions on the sample complexity of the algorithm. Converse arguments establish necessary conditions on sample complexity, proving that the defined algorithms are asymptotically order-optimal in cost.",
author = "Raman, {Ravi Kiran} and Varshney, {Lav R}",
year = "2017",
month = "8",
day = "9",
doi = "10.1109/ISIT.2017.8006912",
language = "English (US)",
series = "IEEE International Symposium on Information Theory - Proceedings",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
pages = "2163--2167",
booktitle = "2017 IEEE International Symposium on Information Theory, ISIT 2017",
address = "United States",

}

TY - GEN

T1 - Budget-optimal clustering via crowdsourcing

AU - Raman, Ravi Kiran

AU - Varshney, Lav R

PY - 2017/8/9

Y1 - 2017/8/9

AB - This paper defines and studies the problem of universal clustering using responses of crowd workers, without knowledge of worker reliability or task difficulty. We model stochastic worker response distributions by incorporating traits of memory for similar objects and traits of distance among differing objects. We are particularly interested in two limiting worker types - temporary and long-term workers, without and with memory respectively. We first define clustering algorithms for these limiting cases and then integrate them into an algorithm for the unified worker model. We prove asymptotic consistency of the algorithms and establish sufficient conditions on the sample complexity of the algorithm. Converse arguments establish necessary conditions on sample complexity, proving that the defined algorithms are asymptotically order-optimal in cost.

UR - http://www.scopus.com/inward/record.url?scp=85034069747&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85034069747&partnerID=8YFLogxK

U2 - 10.1109/ISIT.2017.8006912

DO - 10.1109/ISIT.2017.8006912

M3 - Conference contribution

AN - SCOPUS:85034069747

T3 - IEEE International Symposium on Information Theory - Proceedings

SP - 2163

EP - 2167

BT - 2017 IEEE International Symposium on Information Theory, ISIT 2017

PB - Institute of Electrical and Electronics Engineers Inc.

ER -