Expert Selection in High-Dimensional Markov Decision Processes

Vicenc Rubies-Royo, Eric Mazumdar, Roy Dong, Claire Tomlin, S. Shankar Sastry

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

In this work we present a multi-armed bandit framework for online expert selection in Markov decision processes and demonstrate its use in high-dimensional settings. Our method takes a set of candidate expert policies and switches between them to rapidly identify the best performing expert using a variant of the classical upper confidence bound algorithm, thus ensuring low regret in the overall performance of the system. This is useful in applications where several expert policies may be available, and one needs to be selected at run-time for the underlying environment.
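The abstract does not spell out the paper's variant of the upper confidence bound algorithm, so the following is only a minimal sketch of the classical UCB1 rule applied to expert selection, not the authors' method. It assumes that each round runs one expert policy for an episode and observes a return scaled to [0, 1]. The names `ucb1_select`, `run_expert_selection`, and `rollouts`, the exploration constant `c`, and the Bernoulli stand-in experts are all hypothetical and introduced here for illustration.

```python
import math
import random


def ucb1_select(counts, means, t, c=2.0):
    """Return the index of the expert with the highest UCB1 score."""
    # Try every expert once before relying on confidence bounds.
    for i, n in enumerate(counts):
        if n == 0:
            return i
    # Empirical mean plus an exploration bonus that shrinks with more pulls.
    scores = [m + math.sqrt(c * math.log(t) / n) for m, n in zip(means, counts)]
    return max(range(len(scores)), key=scores.__getitem__)


def run_expert_selection(rollouts, horizon):
    """Switch between experts for `horizon` rounds.

    `rollouts[i]()` is a hypothetical callable that runs expert i for one
    episode and returns its (assumed) return in [0, 1].
    """
    k = len(rollouts)
    counts = [0] * k
    means = [0.0] * k
    for t in range(1, horizon + 1):
        i = ucb1_select(counts, means, t)
        r = rollouts[i]()
        counts[i] += 1
        means[i] += (r - means[i]) / counts[i]  # incremental mean update
    return counts, means


if __name__ == "__main__":
    rng = random.Random(0)
    # Stand-in "experts": Bernoulli returns with made-up success rates.
    rates = [0.2, 0.5, 0.8]
    rollouts = [lambda p=p: 1.0 if rng.random() < p else 0.0 for p in rates]
    counts, means = run_expert_selection(rollouts, horizon=2000)
    print("pulls per expert:", counts)
    print("estimated mean returns:", means)
```

Under these assumptions, experts with lower estimated returns are still tried occasionally because their confidence bonus grows with the round count, while most rounds go to the expert that currently looks best. That is the general mechanism by which UCB-style methods keep regret low.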

Original language: English (US)
Title of host publication: 2020 59th IEEE Conference on Decision and Control, CDC 2020
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 3604-3610
Number of pages: 7
ISBN (Electronic): 9781728174471
State: Published - Dec 14 2020
Externally published: Yes
Event: 59th IEEE Conference on Decision and Control, CDC 2020 - Virtual, Jeju Island, Korea, Republic of
Duration: Dec 14 2020 to Dec 18 2020

Publication series

Name: Proceedings of the IEEE Conference on Decision and Control
Volume: 2020-December
ISSN (Print): 0743-1546
ISSN (Electronic): 2576-2370

Conference

Conference: 59th IEEE Conference on Decision and Control, CDC 2020
Country/Territory: Korea, Republic of
City: Virtual, Jeju Island
Period: 12/14/20 to 12/18/20

ASJC Scopus subject areas

  • Control and Systems Engineering
  • Modeling and Simulation
  • Control and Optimization
