A Case Study on the Convergence of Direct Policy Search for Linear Quadratic Gaussian Control

Darioush Keivan, Peter Seiler, Geir Dullerud, H. Bin

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Policy optimization has gained renewed attention from the control community, serving as a pivotal link between control theory and reinforcement learning. In the past few years, the global convergence theory of direct policy search on state-feedback linear control benchmarks has been developed. However, it remains difficult to establish the global convergence of policy optimization on the linear quadratic Gaussian (LQG) problem, marked by the presence of suboptimal stationary points and the lack of cost coerciveness. In this paper, we revisit the policy optimization intricacies of LQG via a case study on first-order single-input single-output (SISO) systems. For this case study, while the issue related to suboptimal stationary points can be easily fixed via parameterizing the policy class more carefully, the non-coerciveness of the LQG cost function still poses a substantial obstacle to a straightforward global convergence proof for the policy gradient method. Our contribution, within the scope of this case study, introduces an approach to construct a positive invariant set for the policy gradient flow, addressing the non-coerciveness issue in the global convergence proof. Based on our analysis, the policy gradient flow can be guaranteed to converge to the globally optimal full-order dynamic controller in this particular scenario. In summary, although centered on a specific case study, our work broadens the comprehension of how the absence of coerciveness impacts LQG policy optimization, highlighting inherent complexities.
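The setting described in the abstract can be illustrated with a minimal numerical sketch. Everything here is an assumption made for illustration and is not taken from the paper: the plant values (a = b = c = 1, unit weights and noise intensities), the naive three-parameter controller (aK, bK, cK), the finite-difference gradient, and the backtracking descent loop that stands in for the gradient flow. The closed-loop cost is computed from the 2x2 Lyapunov equation. The last function shows one simple way sublevel sets can be unbounded in this naive parameterization: rescaling (bK, cK) to (t·bK, cK/t) leaves the cost unchanged. This is not necessarily the mechanism analyzed in the paper.

```python
import math

def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def lqg_cost(K, a=1.0, b=1.0, c=1.0, q=1.0, r=1.0, W=1.0, V=1.0):
    """Steady-state cost E[q x^2 + r u^2] for the plant x' = a x + b u + w,
    y = c x + v and the controller xi' = aK xi + bK y, u = cK xi.
    Returns inf when the closed loop is not stable."""
    aK, bK, cK = K
    a11, a12, a21, a22 = a, b * cK, bK * c, aK   # closed-loop matrix entries
    # A 2x2 matrix is Hurwitz iff its trace is negative and its determinant positive.
    if not (a11 + a22 < 0 and a11 * a22 - a12 * a21 > 0):
        return math.inf
    # Lyapunov equation A X + X A^T + diag(W, bK^2 V) = 0 written as a
    # linear system in the unknowns (x11, x12, x22), solved by Cramer's rule.
    coeff = [[2 * a11, 2 * a12, 0.0],
             [a21, a11 + a22, a12],
             [0.0, 2 * a21, 2 * a22]]
    rhs = [-W, 0.0, -bK * bK * V]
    d = det3(coeff)
    sol = []
    for j in range(3):
        mj = [row[:] for row in coeff]
        for i in range(3):
            mj[i][j] = rhs[i]
        sol.append(det3(mj) / d)
    x11, _, x22 = sol
    return q * x11 + r * cK * cK * x22

def grad(K, h=1e-6):
    """Central finite-difference gradient of the cost (illustrative only)."""
    g = []
    for i in range(3):
        up, dn = list(K), list(K)
        up[i] += h
        dn[i] -= h
        g.append((lqg_cost(up) - lqg_cost(dn)) / (2 * h))
    return g

def descend(K, steps=200, step=0.1):
    """Gradient descent with step halving; a step is accepted only if it
    lowers the cost, so the iterates never leave the stabilizing set."""
    K = list(K)
    J = lqg_cost(K)
    for _ in range(steps):
        g = grad(K)
        t = step
        while t > 1e-12:
            cand = [k - t * gi for k, gi in zip(K, g)]
            Jc = lqg_cost(cand)
            if Jc < J:
                K, J = cand, Jc
                break
            t *= 0.5
    return K, J

def rescale(K, t):
    """Similarity rescaling of the controller state: same input-output map."""
    aK, bK, cK = K
    return [aK, bK * t, cK / t]

# Classical LQG solution for these assumed plant values (scalar Riccati equations).
p = 1.0 + math.sqrt(2.0)              # solves 2p - p^2 + 1 = 0
K_opt = [1.0 - 2.0 * p, p, -p]        # aK = a - b*gain - L*c, bK = L, cK = -gain
K0 = [-3.0, 2.0, -2.0]                # a hand-picked stabilizing starting point

K_final, J_final = descend(K0)
print("initial cost:", lqg_cost(K0))
print("cost after descent:", J_final)
print("optimal cost:", lqg_cost(K_opt))
print("cost after rescaling by 1000:", lqg_cost(rescale(K0, 1000.0)))
```

Because the cost is identical along the whole curve `rescale(K, t)` for t > 0, a bounded cost does not bound the controller parameters in this parameterization, which is why an argument based on compact sublevel sets does not apply directly.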

Original language: English (US)
Title of host publication: 2024 American Control Conference, ACC 2024
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 3710-3715
Number of pages: 6
ISBN (Electronic): 9798350382655
State: Published - 2024
Event: 2024 American Control Conference, ACC 2024 - Toronto, Canada
Duration: Jul 10 2024 → Jul 12 2024

Publication series

Name: Proceedings of the American Control Conference
ISSN (Print): 0743-1619

Conference

Conference: 2024 American Control Conference, ACC 2024
Country/Territory: Canada
City: Toronto
Period: 7/10/24 → 7/12/24

ASJC Scopus subject areas

  • Electrical and Electronic Engineering
