TY - JOUR
T1 - Near-optimal algorithms for piecewise-stationary cascading bandits
AU - Wang, Lingda
AU - Zhou, Huozhi
AU - Li, Bingcong
AU - Varshney, Lav R.
AU - Zhao, Zhizhen
N1 - Funding Information:
1indicates equal contributions. This work has been supported by the IBM-Illinois Center for Cognitive Computing Systems Research, and the Alfred P. Sloan Foundation.
Publisher Copyright:
© 2021 IEEE
PY - 2021
Y1 - 2021
N2 - Cascading bandit (CB) is a popular model for web search and online advertising. However, the stationary CB model may be too simple to cope with real-world problems, where user preferences may change over time. Considering piecewise-stationary environments, two efficient algorithms, GLRT-CascadeUCB and GLRT-CascadeKL-UCB, are developed. Comparing with existing works, the proposed algorithms: i) are free of change-point-dependent information for choosing parameters; ii) have fewer tuning parameters; iii) improve regret upper bounds. We also show that the proposed algorithms are optimal up to logarithm terms by deriving a minimax lower bound Ω(√NLT) for piecewise-stationary CB. The efficiency of the proposed algorithms is validated through numerical tests on a real-world benchmark dataset.
AB - Cascading bandit (CB) is a popular model for web search and online advertising. However, the stationary CB model may be too simple to cope with real-world problems, where user preferences may change over time. Considering piecewise-stationary environments, two efficient algorithms, GLRT-CascadeUCB and GLRT-CascadeKL-UCB, are developed. Comparing with existing works, the proposed algorithms: i) are free of change-point-dependent information for choosing parameters; ii) have fewer tuning parameters; iii) improve regret upper bounds. We also show that the proposed algorithms are optimal up to logarithm terms by deriving a minimax lower bound Ω(√NLT) for piecewise-stationary CB. The efficiency of the proposed algorithms is validated through numerical tests on a real-world benchmark dataset.
KW - Cascading bandits
KW - Change-point detection
KW - Non-stationary bandits
KW - Online learning
UR - http://www.scopus.com/inward/record.url?scp=85115080447&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85115080447&partnerID=8YFLogxK
U2 - 10.1109/ICASSP39728.2021.9414506
DO - 10.1109/ICASSP39728.2021.9414506
M3 - Conference article
AN - SCOPUS:85115080447
SN - 1520-6149
VL - 2021-June
SP - 3365
EP - 3369
JO - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
JF - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
T2 - 2021 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2021
Y2 - 6 June 2021 through 11 June 2021
ER -