TY - GEN
T1 - Model-Free μ Synthesis via Adversarial Reinforcement Learning
AU - Keivan, Darioush
AU - Havens, Aaron
AU - Seiler, Peter
AU - Dullerud, Geir
AU - Hu, Bin
N1 - ACKNOWLEDGMENT: D. Keivan and G. Dullerud are partially funded by NSF under grant ECCS 19-32735. A. Havens and B. Hu are generously supported by NSF award CAREER-2048168 and a 2020 Amazon research award. P. Seiler is supported by US ONR grant N00014-18-1-2209.
PY - 2022
Y1 - 2022
N2 - Motivated by the recent empirical success of policy-based reinforcement learning (RL), there has been a research trend studying the performance of policy-based RL methods on standard control benchmark problems. In this paper, we examine the effectiveness of policy-based RL methods on an important robust control problem, namely μ synthesis. We build a connection between robust adversarial RL and μ synthesis, and develop a model-free version of the well-known DK-iteration for solving state-feedback μ synthesis with static D-scaling. In the proposed algorithm, the K step mimics the classical central path algorithm by incorporating a recently developed double-loop adversarial RL method as a subroutine, and the D step is based on model-free finite-difference approximation. An extensive numerical study is also presented to demonstrate the utility of our proposed model-free algorithm. Our study sheds new light on the connections between adversarial RL and robust control.
AB - Motivated by the recent empirical success of policy-based reinforcement learning (RL), there has been a research trend studying the performance of policy-based RL methods on standard control benchmark problems. In this paper, we examine the effectiveness of policy-based RL methods on an important robust control problem, namely μ synthesis. We build a connection between robust adversarial RL and μ synthesis, and develop a model-free version of the well-known DK-iteration for solving state-feedback μ synthesis with static D-scaling. In the proposed algorithm, the K step mimics the classical central path algorithm by incorporating a recently developed double-loop adversarial RL method as a subroutine, and the D step is based on model-free finite-difference approximation. An extensive numerical study is also presented to demonstrate the utility of our proposed model-free algorithm. Our study sheds new light on the connections between adversarial RL and robust control.
UR - http://www.scopus.com/inward/record.url?scp=85138492337&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85138492337&partnerID=8YFLogxK
U2 - 10.23919/ACC53348.2022.9867674
DO - 10.23919/ACC53348.2022.9867674
M3 - Conference contribution
AN - SCOPUS:85138492337
T3 - Proceedings of the American Control Conference
SP - 3335
EP - 3341
BT - 2022 American Control Conference, ACC 2022
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2022 American Control Conference, ACC 2022
Y2 - 8 June 2022 through 10 June 2022
ER -