Controlling a Markov decision process with an abrupt change in the transition kernel

Research output: Contribution to journal › Article › peer-review

Abstract

We consider the control of a Markov decision process (MDP) that undergoes an abrupt change in its transition kernel (mode). We formulate the problem of minimizing regret under control switching based on mode change detection, compared to a mode-observing controller, as an optimal stopping problem. Using a sequence of approximations, we reduce it to a quickest change detection (QCD) problem with Markovian data, for which we characterize a state-dependent threshold-type optimal change detection policy. Numerical experiments illustrate various properties of our control-switching policy.
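The abstract does not spell out the detection rule itself. As a rough illustration only, the sketch below shows one standard way a threshold-type change detector for Markovian data can look: a Shiryaev-style posterior recursion over observed state transitions, stopped when the posterior crosses a threshold that depends on the current state. Everything in it is an assumption made for illustration and is not taken from the article: the two-state kernels `P0` and `P1`, the geometric change prior `rho`, the threshold values, and the function names `posterior_update` and `detect`.

```python
def posterior_update(p, s_prev, s_next, P0, P1, rho):
    """One Bayesian update of the probability that the change has already occurred.

    p      : posterior probability of "changed" before seeing this transition
    P0, P1 : pre-change and post-change transition matrices (lists of rows)
    rho    : assumed per-step probability that the change happens (geometric prior)
    """
    # The change may have occurred just before this transition.
    prior = p + (1.0 - p) * rho
    # Weigh the observed transition under each kernel.
    num = prior * P1[s_prev][s_next]
    den = num + (1.0 - prior) * P0[s_prev][s_next]
    return num / den


def detect(states, P0, P1, rho, thresholds):
    """Return the first time index at which the posterior reaches the
    threshold assigned to the current state, or None if it never does."""
    p = 0.0
    for t in range(1, len(states)):
        p = posterior_update(p, states[t - 1], states[t], P0, P1, rho)
        if p >= thresholds[states[t]]:
            return t
    return None


# Hypothetical two-state example: before the change the chain favors state 0,
# after the change it favors state 1.
P0 = [[0.9, 0.1], [0.9, 0.1]]
P1 = [[0.1, 0.9], [0.1, 0.9]]
path = [0, 0, 0, 1, 1, 1, 1]

uniform_stop = detect(path, P0, P1, rho=0.05, thresholds=[0.9, 0.9])
state_dependent_stop = detect(path, P0, P1, rho=0.05, thresholds=[0.9, 0.8])
```

Lowering the threshold for state 1 makes the detector declare a change sooner when the chain sits in that state, which is the sense in which such a rule is state-dependent. The thresholds here are picked by hand; the article characterizes an optimal policy, which this sketch does not attempt to reproduce.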

Original language: English (US)
Pages (from-to): 325-345
Number of pages: 21
Journal: Applied and Computational Mathematics
Volume: 23
Issue number: 3 Special Issue
State: Published - 2024

Keywords

  • Change Detection
  • Detection Policy
  • Optimal Change
  • Optimal Stopping
  • Piecewise Stationary Environment
  • Sequence Approximations
  • Switched Markov Decision Process

ASJC Scopus subject areas

  • Computational Mathematics
  • Applied Mathematics
