Physics-Model-Regulated Deep Reinforcement Learning Towards Safety & Stability Guarantees

Hongpeng Cao, Yanbing Mao, Lui Sha, Marco Caccamo

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Deep reinforcement learning (DRL) has demonstrated impressive success in solving complex control tasks by synthesizing control policies from data. However, the safety and stability of applying DRL to safety-critical systems remain a primary concern and a challenging problem. To address this problem, we propose Phy-DRL, a novel physics-model-regulated deep reinforcement learning framework. Phy-DRL is novel in two architectural designs: a physics-model-regulated reward, and residual control, which integrates physics-model-based control with data-driven control. Together, these designs give Phy-DRL mathematically provable safety and stability guarantees. Finally, the effectiveness of Phy-DRL is validated on an inverted pendulum system. The experimental results also demonstrate that Phy-DRL features markedly accelerated training and increased reward.

Original language: English (US)
Title of host publication: 2023 62nd IEEE Conference on Decision and Control, CDC 2023
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 8306-8311
Number of pages: 6
ISBN (Electronic): 9798350301243
State: Published - 2023
Event: 62nd IEEE Conference on Decision and Control, CDC 2023 - Singapore, Singapore
Duration: Dec 13 2023 to Dec 15 2023

Publication series

Name: Proceedings of the IEEE Conference on Decision and Control
ISSN (Print): 0743-1546
ISSN (Electronic): 2576-2370

Conference

Conference: 62nd IEEE Conference on Decision and Control, CDC 2023
Country/Territory: Singapore
City: Singapore
Period: 12/13/23 to 12/15/23

ASJC Scopus subject areas

  • Control and Systems Engineering
  • Modeling and Simulation
  • Control and Optimization
