TY - GEN
T1 - Policy Search in Infinite-Horizon Discounted Reinforcement Learning
T2 - 53rd Annual Conference on Information Sciences and Systems, CISS 2019
AU - Zhang, Kaiqing
AU - Koppel, Alec
AU - Zhu, Hao
AU - Başar, Tamer
N1 - Publisher Copyright: © 2019 IEEE.
PY - 2019/4/16
Y1 - 2019/4/16
N2 - In reinforcement learning (RL), an agent moving through a state space selects actions that cause a transition to a new state according to an unknown Markov transition density that depends on the previous state and action. After each transition, a reward that indicates the quality of being in a particular state is revealed. The goal is to select the action sequence that maximizes the long-term accumulation of rewards, or value. We focus on the case where the policy that determines how actions are chosen is a fixed stationary distribution parameterized by a vector, the problem horizon is infinite, and the states and actions belong to continuous Euclidean subsets.
AB - In reinforcement learning (RL), an agent moving through a state space selects actions that cause a transition to a new state according to an unknown Markov transition density that depends on the previous state and action. After each transition, a reward that indicates the quality of being in a particular state is revealed. The goal is to select the action sequence that maximizes the long-term accumulation of rewards, or value. We focus on the case where the policy that determines how actions are chosen is a fixed stationary distribution parameterized by a vector, the problem horizon is infinite, and the states and actions belong to continuous Euclidean subsets.
UR - http://www.scopus.com/inward/record.url?scp=85065170617&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85065170617&partnerID=8YFLogxK
U2 - 10.1109/CISS.2019.8693017
DO - 10.1109/CISS.2019.8693017
M3 - Conference contribution
AN - SCOPUS:85065170617
T3 - 2019 53rd Annual Conference on Information Sciences and Systems, CISS 2019
BT - 2019 53rd Annual Conference on Information Sciences and Systems, CISS 2019
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 20 March 2019 through 22 March 2019
ER -