The explore-exploit dilemma in nonstationary decision making under uncertainty

Allan Axelrod, Girish Chowdhary

Research output: Chapter in Book/Report/Conference proceeding › Chapter

Abstract

It is often assumed that autonomous systems operate in stationary (time-invariant) environments. However, real-world environments are often nonstationary (time-varying): the underlying phenomena change over time, so stationary approximations of a nonstationary environment may quickly lose relevance. Here, two approaches are presented and applied in the context of reinforcement learning in nonstationary environments. In Sect. 2.2, the first approach leverages reinforcement learning in the presence of a changing reward model. In particular, a functional termed the Fog-of-War is used to drive exploration, which results in the timely discovery of new models in nonstationary environments. In Sect. 2.3, the Fog-of-War functional is adapted in real time to reflect the heterogeneous information content of a real-world environment; this is critically important for applying the approach of Sect. 2.2 in real-world environments.
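The abstract's core idea, exploration pressure that regrows over time so the agent rediscovers changed models, can be illustrated with a toy sketch. The class name, the linear "fog" bonus, and the bandit setting below are illustrative assumptions, not the chapter's actual Fog-of-War formulation: uncertainty about each option is modeled as growing linearly with the time since it was last sampled, which forces periodic re-exploration even after the agent has settled on an apparent best option.

```python
class FogOfWarBandit:
    """Toy nonstationary bandit agent with a staleness-driven exploration
    bonus, loosely in the spirit of a Fog-of-War functional (illustrative
    sketch only; not the chapter's actual method)."""

    def __init__(self, n_arms, fog_rate=0.1):
        self.n_arms = n_arms
        self.fog_rate = fog_rate          # how fast "fog" (uncertainty) regrows
        self.estimates = [0.0] * n_arms   # recency-weighted reward estimates
        self.last_pull = [0] * n_arms     # timestep each arm was last sampled
        self.t = 0

    def select_arm(self):
        # Score = estimated value + fog bonus; the bonus grows with the
        # time since the arm was last observed, driving re-exploration.
        scores = [
            self.estimates[a] + self.fog_rate * (self.t - self.last_pull[a])
            for a in range(self.n_arms)
        ]
        return max(range(self.n_arms), key=lambda a: scores[a])

    def update(self, arm, reward, step_size=0.2):
        # A constant step size weights recent rewards more heavily,
        # which lets the estimate track a nonstationary reward model.
        self.t += 1
        self.last_pull[arm] = self.t
        self.estimates[arm] += step_size * (reward - self.estimates[arm])
```

In a run where the rewarding arm switches partway through, the fog bonus makes the agent keep probing the stale arm, so it notices the switch quickly instead of exploiting the outdated estimate forever; with the bonus removed, a greedy agent would never revisit the other arm once one estimate dominates.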

Original language: English (US)
Title of host publication: Studies in Systems, Decision and Control
Publisher: Springer
Pages: 29-52
Number of pages: 24
DOIs
State: Published - 2015
Externally published: Yes

Publication series

Name: Studies in Systems, Decision and Control
Volume: 42
ISSN (Print): 2198-4182
ISSN (Electronic): 2198-4190

ASJC Scopus subject areas

  • Computer Science (miscellaneous)
  • Control and Systems Engineering
  • Automotive Engineering
  • Social Sciences (miscellaneous)
  • Economics, Econometrics and Finance (miscellaneous)
  • Control and Optimization
  • Decision Sciences (miscellaneous)
