TY - JOUR
T1 - Pointwise-in-Time Diagnostics for Reinforcement Learning During Training and Runtime
AU - Brindise, Noel
AU - Posada-Moreno, Andres Felipe
AU - Langbort, Cedric
AU - Trimpe, Sebastian
N1 - This research was funded in part by a National Defense Science and Engineering Graduate Fellowship. It was also funded in part by an Advanced Research Opportunities Program (AROP) scholarship from the RWTH Aachen University and by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) under Germany’s Excellence Strategy – EXC-2023 Internet of Production – 390621612.
PY - 2024
Y1 - 2024
AB - Explainable AI Planning (XAIP), a subfield of xAI, offers a variety of methods to interpret the behavior of autonomous systems. A recent “pointwise-in-time” explanation method, called Rule Status Assessment (RSA), characterizes an agent’s behavior at individual time steps in a trajectory using linear temporal logic (LTL) rules. In this work, RSA is applied for the first time in a reinforcement learning (RL) context. We first demonstrate RSA diagnostics as a substantial supplement to the basic RL reward curve, tracking whether and when specified subtasks are accomplished. We then introduce a novel “Interactive RSA” which provides the user with detailed diagnostic information automatically at any desired point in a trajectory. We apply RSA to an advanced agent at runtime and show that RSA and its novel interactive variant constitute a promising step towards explainable RL.
KW - Explainable AI Planning
KW - Explainable Reinforcement Learning
KW - Linear Temporal Logic
KW - Markov Decision Processes
UR - http://www.scopus.com/inward/record.url?scp=85203704496&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85203704496&partnerID=8YFLogxK
M3 - Conference article
AN - SCOPUS:85203704496
SN - 2640-3498
VL - 242
SP - 694
EP - 706
JO - Proceedings of Machine Learning Research
JF - Proceedings of Machine Learning Research
T2 - 6th Annual Learning for Dynamics and Control Conference, L4DC 2024
Y2 - 15 July 2024 through 17 July 2024
ER -