TY - CHAP
T1 - The explore-exploit dilemma in nonstationary decision making under uncertainty
AU - Axelrod, Allan
AU - Chowdhary, Girish
N1 - Publisher Copyright: © Springer International Publishing Switzerland 2015.
PY - 2015
Y1 - 2015
N2 - It is often assumed that autonomous systems operate in environments that may be described by a stationary (time-invariant) model. However, real-world environments are often nonstationary (time-varying), where the underlying phenomena change in time, so stationary approximations of the nonstationary environment may quickly lose relevance. Here, two approaches are presented and applied in the context of reinforcement learning in nonstationary environments. In Sect. 2.2, the first approach leverages reinforcement learning in the presence of a changing reward model. In particular, a functional termed the Fog-of-War is used to drive exploration, which results in the timely discovery of new models in nonstationary environments. In Sect. 2.3, the Fog-of-War functional is adapted in real time to reflect the heterogeneous information content of a real-world environment; this is critically important for the use of the approach in Sect. 2.2 in real-world environments.
AB - It is often assumed that autonomous systems operate in environments that may be described by a stationary (time-invariant) model. However, real-world environments are often nonstationary (time-varying), where the underlying phenomena change in time, so stationary approximations of the nonstationary environment may quickly lose relevance. Here, two approaches are presented and applied in the context of reinforcement learning in nonstationary environments. In Sect. 2.2, the first approach leverages reinforcement learning in the presence of a changing reward model. In particular, a functional termed the Fog-of-War is used to drive exploration, which results in the timely discovery of new models in nonstationary environments. In Sect. 2.3, the Fog-of-War functional is adapted in real time to reflect the heterogeneous information content of a real-world environment; this is critically important for the use of the approach in Sect. 2.2 in real-world environments.
UR - http://www.scopus.com/inward/record.url?scp=85028919187&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85028919187&partnerID=8YFLogxK
U2 - 10.1007/978-3-319-26327-4_2
DO - 10.1007/978-3-319-26327-4_2
M3 - Chapter
AN - SCOPUS:85028919187
T3 - Studies in Systems, Decision and Control
SP - 29
EP - 52
BT - Studies in Systems, Decision and Control
PB - Springer
ER -