TY - GEN
T1 - Convergence and optimality of policy gradient primal-dual method for constrained Markov decision processes
AU - Ding, Dongsheng
AU - Zhang, Kaiqing
AU - Basar, Tamer
AU - Jovanovic, Mihailo R.
N1 - Publisher Copyright:
© 2022 American Automatic Control Council.
PY - 2022
Y1 - 2022
AB - We study constrained Markov decision processes with finite state and action spaces. The optimal solution of a discounted infinite-horizon optimal control problem is obtained using a Policy Gradient Primal-Dual (PG-PD) method without any policy parametrization. This method updates the primal variable via projected policy gradient ascent and the dual variable via projected sub-gradient descent. Despite the lack of concavity of the constrained maximization problem in policy space, we exploit the underlying structure to provide non-asymptotic global convergence guarantees with sublinear rates in terms of both the optimality gap and the constraint violation. Furthermore, for a sample-based PG-PD algorithm, we quantify sample complexity and offer computational experiments to demonstrate the effectiveness of our results.
UR - http://www.scopus.com/inward/record.url?scp=85138493991&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85138493991&partnerID=8YFLogxK
U2 - 10.23919/ACC53348.2022.9867805
DO - 10.23919/ACC53348.2022.9867805
M3 - Conference contribution
AN - SCOPUS:85138493991
T3 - Proceedings of the American Control Conference
SP - 2851
EP - 2856
BT - 2022 American Control Conference, ACC 2022
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2022 American Control Conference, ACC 2022
Y2 - 8 June 2022 through 10 June 2022
ER -