TY - GEN
T1 - On the Convergence of Natural Policy Gradient and Mirror Descent-Like Policy Methods for Average-Reward MDPs
AU - Murthy, Yashaswini
AU - Srikant, R.
N1 - Publisher Copyright: © 2023 IEEE.
PY - 2023
Y1 - 2023
N2 - It is now well known that Natural Policy Gradient (NPG) globally converges for discounted-reward MDPs in the tabular setting, with perfect value function estimates. However, the result cannot be directly used to obtain a corresponding convergence result for average-reward MDPs by letting the discount factor tend to one. In this paper, we prove that NPG also converges for average-reward MDPs in which each policy leads to an irreducible Markov chain. Since NPG can also be interpreted as a mirror descent-based policy method, we then discuss extensions to non-tabular settings for mirror descent-based methods.
AB - It is now well known that Natural Policy Gradient (NPG) globally converges for discounted-reward MDPs in the tabular setting, with perfect value function estimates. However, the result cannot be directly used to obtain a corresponding convergence result for average-reward MDPs by letting the discount factor tend to one. In this paper, we prove that NPG also converges for average-reward MDPs in which each policy leads to an irreducible Markov chain. Since NPG can also be interpreted as a mirror descent-based policy method, we then discuss extensions to non-tabular settings for mirror descent-based methods.
UR - http://www.scopus.com/inward/record.url?scp=85184798985&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85184798985&partnerID=8YFLogxK
U2 - 10.1109/CDC49753.2023.10383691
DO - 10.1109/CDC49753.2023.10383691
M3 - Conference contribution
AN - SCOPUS:85184798985
T3 - Proceedings of the IEEE Conference on Decision and Control
SP - 1979
EP - 1984
BT - 2023 62nd IEEE Conference on Decision and Control, CDC 2023
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 62nd IEEE Conference on Decision and Control, CDC 2023
Y2 - 13 December 2023 through 15 December 2023
ER -