Convergence and Iteration Complexity of Policy Gradient Method for Infinite-horizon Reinforcement Learning

Kaiqing Zhang, Alec Koppel, Hao Zhu, Tamer Basar

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution
