Abstract
We consider a partially observable lost-sales inventory system in which the inventory level is observed only when it reaches zero. We use the vanishing discount factor approach to prove the existence of a stationary optimal policy for average cost minimization. As our main methodological contribution, we provide a way to verify the key condition of the vanishing discount factor approach, namely the uniform boundedness of the relative discounted value function. To accomplish that, we construct a valid policy that, in a certain sense, “copies” the actions of another policy for the process with a different initial state. To the best of our knowledge, this paper is the first to study partially observable inventory models under the average cost criterion.
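For readers unfamiliar with the vanishing discount factor approach, the condition named in the abstract can be sketched in generic Markov decision process notation. The symbols below ($x$ for the state, which in a partially observable model would typically be a belief state, $a$ for the action, $c$ for the one-step cost, $\alpha$ for the discount factor, and $\alpha_0$, $w$, $u$) are assumptions introduced here for illustration and are not taken from the paper, whose exact formulation may differ.

```latex
% Discounted value function, its infimum over states, and the relative
% discounted value function (generic notation, assumed for illustration)
v_\alpha(x) = \inf_{\pi} \, \mathbb{E}_x^{\pi} \sum_{t=0}^{\infty} \alpha^{t} c(x_t, a_t),
\qquad
m_\alpha = \inf_{x} v_\alpha(x),
\qquad
u_\alpha(x) = v_\alpha(x) - m_\alpha .

% Uniform boundedness: for some \alpha_0 \in [0, 1) and every state x
\sup_{\alpha \in [\alpha_0, 1)} u_\alpha(x) < \infty .

% A typical form of the resulting average cost optimality inequality,
% for a suitable constant w and function u
w + u(x) \;\ge\; \min_{a} \Bigl\{ c(x, a) + \mathbb{E}\bigl[ u(x_{t+1}) \mid x_t = x,\ a_t = a \bigr] \Bigr\} .
```

In this generic setting, a stationary policy that attains the minimum on the right-hand side for every state has average cost at most $w$, which is how an inequality of this type yields a stationary optimal policy.
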
Original language | English (US) |
---|---|
Pages (from-to) | 2390-2396 |
Number of pages | 7 |
Journal | Operations Research |
Volume | 71 |
Issue number | 6 |
State | Published - Nov 1 2023 |
Keywords
- Markov decision process
- average cost
- lost sales
- optimality inequality
- partial observations
ASJC Scopus subject areas
- Computer Science Applications
- Management Science and Operations Research