Optimal PAC multiple arm identification with applications to crowdsourcing

Yuan Zhou, Xi Chen, Jian Li

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

We study the problem of selecting K arms with the highest expected rewards in a stochastic n-armed bandit game. Instead of using existing evaluation metrics (e.g., misidentification probability (Bubeck et al., 2013) or the metric in EXPLORE-K (Kalyanakrishnan & Stone, 2010)), we propose to use the aggregate regret, which is defined as the gap between the average reward of the optimal solution and that of our solution. Besides being a natural metric by itself, we argue that in many applications, such as our motivating example from crowdsourcing, the aggregate regret bound is more suitable. We propose a new PAC algorithm, which, with probability at least 1 - δ, identifies a set of K arms with regret at most ε. We provide the sample complexity bound of our algorithm. To complement this, we establish the lower bound and show that the sample complexity of our algorithm matches the lower bound. Finally, we report experimental results on both synthetic and real data sets, which demonstrate the superior performance of the proposed algorithm.
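The aggregate regret defined in the abstract can be illustrated with a small sketch. This is not the paper's algorithm; it only computes the metric for a given selection, assuming the true expected rewards are known (which they are not in the actual bandit setting). The function name `aggregate_regret` and the sample means are hypothetical and chosen for illustration.

```python
from typing import Sequence


def aggregate_regret(means: Sequence[float], chosen: Sequence[int]) -> float:
    """Average expected reward of the best K arms minus that of the chosen K arms.

    `means` holds the (assumed known) expected reward of each arm;
    `chosen` holds K distinct arm indices. Both names are illustrative.
    """
    k = len(chosen)
    if k == 0 or len(set(chosen)) != k:
        raise ValueError("chosen must contain K >= 1 distinct arm indices")
    # Optimal solution: the K arms with the highest expected rewards.
    best = sorted(means, reverse=True)[:k]
    picked = [means[i] for i in chosen]
    return (sum(best) - sum(picked)) / k


# Hypothetical 4-armed instance with K = 2.
example_means = [0.9, 0.8, 0.5, 0.4]
optimal_choice = aggregate_regret(example_means, [0, 1])
suboptimal_choice = aggregate_regret(example_means, [0, 2])
print(optimal_choice, suboptimal_choice)
```

Under this metric, a selection is acceptable in the PAC sense described above when its aggregate regret is at most ε, even if it swaps a top arm for a nearly equivalent one.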

Original language: English (US)
Title of host publication: 31st International Conference on Machine Learning, ICML 2014
Publisher: International Machine Learning Society (IMLS)
Pages: 1446-1469
Number of pages: 24
ISBN (Electronic): 9781634393973
State: Published - Jan 1 2014
Externally published: Yes
Event: 31st International Conference on Machine Learning, ICML 2014 - Beijing, China
Duration: Jun 21 2014 to Jun 26 2014

Publication series

Name: 31st International Conference on Machine Learning, ICML 2014
Volume: 2

Other

Other: 31st International Conference on Machine Learning, ICML 2014
Country: China
City: Beijing
Period: 6/21/14 to 6/26/14

ASJC Scopus subject areas

  • Artificial Intelligence
  • Computer Networks and Communications
  • Software

Cite this

Zhou, Y., Chen, X., & Li, J. (2014). Optimal PAC multiple arm identification with applications to crowdsourcing. In 31st International Conference on Machine Learning, ICML 2014 (pp. 1446-1469). (31st International Conference on Machine Learning, ICML 2014; Vol. 2). International Machine Learning Society (IMLS).

Optimal PAC multiple arm identification with applications to crowdsourcing. / Zhou, Yuan; Chen, Xi; Li, Jian.

31st International Conference on Machine Learning, ICML 2014. International Machine Learning Society (IMLS), 2014. p. 1446-1469 (31st International Conference on Machine Learning, ICML 2014; Vol. 2).

Zhou, Y, Chen, X & Li, J 2014, Optimal PAC multiple arm identification with applications to crowdsourcing. in 31st International Conference on Machine Learning, ICML 2014. 31st International Conference on Machine Learning, ICML 2014, vol. 2, International Machine Learning Society (IMLS), pp. 1446-1469, 31st International Conference on Machine Learning, ICML 2014, Beijing, China, 6/21/14.
Zhou Y, Chen X, Li J. Optimal PAC multiple arm identification with applications to crowdsourcing. In 31st International Conference on Machine Learning, ICML 2014. International Machine Learning Society (IMLS). 2014. p. 1446-1469. (31st International Conference on Machine Learning, ICML 2014).
Zhou, Yuan ; Chen, Xi ; Li, Jian. / Optimal PAC multiple arm identification with applications to crowdsourcing. 31st International Conference on Machine Learning, ICML 2014. International Machine Learning Society (IMLS), 2014. pp. 1446-1469 (31st International Conference on Machine Learning, ICML 2014).
@inproceedings{cfc004db8f0b4b5abf2b4edecac2269a,
title = "Optimal PAC multiple arm identification with applications to crowdsourcing",
abstract = "We study the problem of selecting K arms with the highest expected rewards in a stochastic n-armed bandit game. Instead of using existing evaluation metrics (e.g., misidentification probability (Bubeck et al., 2013) or the metric in EXPLORE-K (Kalyanakrishnan \& Stone, 2010)), we propose to use the aggregate regret, which is defined as the gap between the average reward of the optimal solution and that of our solution. Besides being a natural metric by itself, we argue that in many applications, such as our motivating example from crowdsourcing, the aggregate regret bound is more suitable. We propose a new PAC algorithm, which, with probability at least 1 - δ, identifies a set of K arms with regret at most ε. We provide the sample complexity bound of our algorithm. To complement this, we establish the lower bound and show that the sample complexity of our algorithm matches the lower bound. Finally, we report experimental results on both synthetic and real data sets, which demonstrate the superior performance of the proposed algorithm.",
author = "Yuan Zhou and Xi Chen and Jian Li",
year = "2014",
month = "1",
day = "1",
language = "English (US)",
series = "31st International Conference on Machine Learning, ICML 2014",
publisher = "International Machine Learning Society (IMLS)",
pages = "1446--1469",
booktitle = "31st International Conference on Machine Learning, ICML 2014",

}

TY - GEN

T1 - Optimal PAC multiple arm identification with applications to crowdsourcing

AU - Zhou, Yuan

AU - Chen, Xi

AU - Li, Jian

PY - 2014/1/1

Y1 - 2014/1/1

N2 - We study the problem of selecting K arms with the highest expected rewards in a stochastic n-armed bandit game. Instead of using existing evaluation metrics (e.g., misidentification probability (Bubeck et al., 2013) or the metric in EXPLORE-K (Kalyanakrishnan & Stone, 2010)), we propose to use the aggregate regret, which is defined as the gap between the average reward of the optimal solution and that of our solution. Besides being a natural metric by itself, we argue that in many applications, such as our motivating example from crowdsourcing, the aggregate regret bound is more suitable. We propose a new PAC algorithm, which, with probability at least 1 - δ, identifies a set of K arms with regret at most ε. We provide the sample complexity bound of our algorithm. To complement this, we establish the lower bound and show that the sample complexity of our algorithm matches the lower bound. Finally, we report experimental results on both synthetic and real data sets, which demonstrate the superior performance of the proposed algorithm.

UR - http://www.scopus.com/inward/record.url?scp=84919921416&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84919921416&partnerID=8YFLogxK

M3 - Conference contribution

AN - SCOPUS:84919921416

T3 - 31st International Conference on Machine Learning, ICML 2014

SP - 1446

EP - 1469

BT - 31st International Conference on Machine Learning, ICML 2014

PB - International Machine Learning Society (IMLS)

ER -