Abstract
In this paper we consider the dynamic assortment selection problem under an uncapacitated multinomial-logit (MNL) model. By carefully analyzing a revenue potential function, we show that a trisection-based algorithm achieves an item-independent regret bound of O(√T log log T), which matches information-theoretic lower bounds up to iterated logarithmic terms. Our proof technique draws on tools from the unimodal/convex bandit literature as well as adaptive confidence parameters in minimax multi-armed bandit problems.
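For readers unfamiliar with trisection, the following is a minimal sketch of the generic, noiseless version of the idea: repeatedly discard one third of an interval when searching for the maximizer of a unimodal function. It is not the paper's algorithm, which works from noisy purchase feedback and uses confidence parameters; the function name `trisection_max`, the tolerance `tol`, and the quadratic test function are assumptions made purely for illustration.

```python
def trisection_max(f, lo, hi, tol=1e-6):
    """Approximate the maximizer of a strictly unimodal function f on [lo, hi].

    Illustrative, noiseless sketch only; not the algorithm from the paper.
    """
    while hi - lo > tol:
        third = (hi - lo) / 3.0
        m1 = lo + third  # left trisection point
        m2 = hi - third  # right trisection point
        if f(m1) < f(m2):
            # f is still increasing at m1, so the maximizer is not in [lo, m1).
            lo = m1
        else:
            # f(m1) >= f(m2), so the maximizer is not in (m2, hi].
            hi = m2
    return (lo + hi) / 2.0


# Hypothetical unimodal objective with its peak at x = 0.3.
peak = trisection_max(lambda x: -(x - 0.3) ** 2, 0.0, 1.0)
```

Each iteration shrinks the search interval to two thirds of its previous length, so the number of function evaluations grows only logarithmically in 1/`tol`.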
Original language | English (US) |
---|---|
Pages (from-to) | 3101-3110 |
Number of pages | 10 |
Journal | Advances in Neural Information Processing Systems |
Volume | 2018-December |
State | Published - 2018 |
Event | 32nd Conference on Neural Information Processing Systems, NeurIPS 2018, Montreal, Canada (Dec 2 2018 → Dec 8 2018) |
Keywords
- Dynamic assortment planning
- Multinomial logit choice model
- Regret analysis
- Trisection algorithm
ASJC Scopus subject areas
- Computer Networks and Communications
- Information Systems
- Signal Processing