TY - JOUR
T1 - Thresholding bandit with optimal aggregate regret
AU - Tao, Chao
AU - Blanco, Saúl A.
AU - Peng, Jian
AU - Zhou, Yuan
N1 - Publisher Copyright:
© 2019 Neural information processing systems foundation. All rights reserved.
Copyright:
Copyright 2020 Elsevier B.V., All rights reserved.
PY - 2019
Y1 - 2019
N2 - We consider the thresholding bandit problem, whose goal is to find arms of mean rewards above a given threshold θ, with a fixed budget of T trials. We introduce LSA, a new, simple and anytime algorithm that aims to minimize the aggregate regret (or the expected number of mis-classified arms). We prove that our algorithm is instance-wise asymptotically optimal. We also provide comprehensive empirical results to demonstrate the algorithm's superior performance over existing algorithms under a variety of different scenarios.
AB - We consider the thresholding bandit problem, whose goal is to find arms of mean rewards above a given threshold θ, with a fixed budget of T trials. We introduce LSA, a new, simple and anytime algorithm that aims to minimize the aggregate regret (or the expected number of mis-classified arms). We prove that our algorithm is instance-wise asymptotically optimal. We also provide comprehensive empirical results to demonstrate the algorithm's superior performance over existing algorithms under a variety of different scenarios.
UR - http://www.scopus.com/inward/record.url?scp=85090176057&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85090176057&partnerID=8YFLogxK
M3 - Conference article
AN - SCOPUS:85090176057
VL - 32
JO - Advances in Neural Information Processing Systems
JF - Advances in Neural Information Processing Systems
SN - 1049-5258
T2 - 33rd Annual Conference on Neural Information Processing Systems, NeurIPS 2019
Y2 - 8 December 2019 through 14 December 2019
ER -