TY - GEN
T1 - Optimal Runtime Assurance via Reinforcement Learning
AU - Miller, Kristina
AU - Zeitler, Christopher K.
AU - Shen, William
AU - Hobbs, Kerianne
AU - Schierman, John
AU - Viswanathan, Mahesh
AU - Mitra, Sayan
N1 - Publisher Copyright:
© 2024 IEEE.
PY - 2024
Y1 - 2024
N2 - AI and Machine Learning could enhance autonomous systems, provided the risk of safety violations can be mitigated. Specific instances of runtime assurance (RTA) have been successful in safely testing untrusted, learning-enabled controllers, but a general design methodology for RTA remains a challenge. The problem is to create a logic that assures safety by switching to a safety (or backup) controller as needed, while maximizing a performance criterion, such as the utilization of the untrusted controller. Existing RTA design strategies are well known to be overly conservative and can lead to safety violations. In this paper, we formulate the optimal RTA design problem and present an approach for solving it. Our approach relies on reward shaping and reinforcement learning. It can guarantee that safety or other hard constraints are met and leverages machine learning technologies for scalability. We have implemented this algorithm and present extensive experimental results on challenging scenarios involving aircraft models, multi-agent systems, realistic simulators, and complex safety requirements. Our experimental results suggest that this RTA design approach can be effective in guaranteeing hard safety constraints while increasing utilization over existing approaches.
AB - AI and Machine Learning could enhance autonomous systems, provided the risk of safety violations can be mitigated. Specific instances of runtime assurance (RTA) have been successful in safely testing untrusted, learning-enabled controllers, but a general design methodology for RTA remains a challenge. The problem is to create a logic that assures safety by switching to a safety (or backup) controller as needed, while maximizing a performance criterion, such as the utilization of the untrusted controller. Existing RTA design strategies are well known to be overly conservative and can lead to safety violations. In this paper, we formulate the optimal RTA design problem and present an approach for solving it. Our approach relies on reward shaping and reinforcement learning. It can guarantee that safety or other hard constraints are met and leverages machine learning technologies for scalability. We have implemented this algorithm and present extensive experimental results on challenging scenarios involving aircraft models, multi-agent systems, realistic simulators, and complex safety requirements. Our experimental results suggest that this RTA design approach can be effective in guaranteeing hard safety constraints while increasing utilization over existing approaches.
KW - reinforcement learning
KW - runtime assurance
KW - safety for CPS
UR - http://www.scopus.com/inward/record.url?scp=85198538372&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85198538372&partnerID=8YFLogxK
U2 - 10.1109/ICCPS61052.2024.00013
DO - 10.1109/ICCPS61052.2024.00013
M3 - Conference contribution
AN - SCOPUS:85198538372
T3 - Proceedings - 15th ACM/IEEE International Conference on Cyber-Physical Systems, ICCPS 2024
SP - 67
EP - 76
BT - Proceedings - 15th ACM/IEEE International Conference on Cyber-Physical Systems, ICCPS 2024
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 15th Annual ACM/IEEE International Conference on Cyber-Physical Systems, ICCPS 2024
Y2 - 13 May 2024 through 16 May 2024
ER -