Dynamic assortment optimization with changing contextual information

Xi Chen, Yining Wang, Yuan Zhou

Research output: Contribution to journalArticlepeer-review

Abstract

In this paper, we study the dynamic assortment optimization problem over a finite selling season of length T. At each time period, the seller offers an arriving customer an assort- ment of substitutable products under a cardinality constraint, and the customer makes the purchase among offered products according to a discrete choice model. Most existing work associates each product with a real-valued fixed mean utility and assumes a multinomial logit choice (MNL) model. In many practical applications, feature/contextual information of products is readily available. In this paper, we incorporate the feature information by assuming a linear relationship between the mean utility and the feature. In addition, we allow the feature information of products to change over time so that the underlying choice model can also be non-stationary. To solve the dynamic assortment optimization under this changing contextual MNL model, we need to simultaneously learn the underlying unknown coefficient and make the decision on the assortment. To this end, we develop an upper confidence bound (UCB) based policy and establish the regret bound on the order of O (d √ T), where d is the dimension of the feature and O suppresses logarithmic dependence. We further establish a lower bound Δ(d √ T/K), where K is the cardinality constraint of an offered assortment, which is usually small. When K is a constant, our policy is optimal up to logarithmic factors. In the exploitation phase of the UCB algorithm, we need to solve a combinatorial optimization problem for assortment optimization based on the learned infor- mation. We further develop an approximation algorithm and an efficient greedy heuristic. The effectiveness of the proposed policy is further demonstrated by our numerical studies.

Original languageEnglish (US)
JournalJournal of Machine Learning Research
Volume21
StatePublished - Oct 2020

Keywords

  • Bandit learning
  • Contextual information
  • Dynamic assortment optimization
  • Regret analysis
  • Upper confidence bounds

ASJC Scopus subject areas

  • Software
  • Control and Systems Engineering
  • Statistics and Probability
  • Artificial Intelligence

Fingerprint Dive into the research topics of 'Dynamic assortment optimization with changing contextual information'. Together they form a unique fingerprint.

Cite this