Abstract

Cascading bandit (CB) is a popular model for web search and online advertising. However, the stationary CB model may be too simple to cope with real-world problems, where user preferences may change over time. Considering piecewise-stationary environments, two efficient algorithms, GLRT-CascadeUCB and GLRT-CascadeKL-UCB, are developed. Comparing with existing works, the proposed algorithms: i) are free of change-point-dependent information for choosing parameters; ii) have fewer tuning parameters; iii) improve regret upper bounds. We also show that the proposed algorithms are optimal up to logarithm terms by deriving a minimax lower bound Ω(NLT) for piecewise-stationary CB. The efficiency of the proposed algorithms is validated through numerical tests on a real-world benchmark dataset.

Original languageEnglish (US)
Pages (from-to)3365-3369
Number of pages5
JournalICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Volume2021-June
DOIs
StatePublished - 2021
Event2021 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2021 - Virtual, Toronto, Canada
Duration: Jun 6 2021Jun 11 2021

Keywords

  • Cascading bandits
  • Change-point detection
  • Non-stationary bandits
  • Online learning

ASJC Scopus subject areas

  • Software
  • Signal Processing
  • Electrical and Electronic Engineering

Fingerprint

Dive into the research topics of 'Near-optimal algorithms for piecewise-stationary cascading bandits'. Together they form a unique fingerprint.

Cite this