TY - GEN
T1 - Statistically Model Checking PCTL Specifications on Markov Decision Processes via Reinforcement Learning
AU - Wang, Yu
AU - Roohi, Nima
AU - West, Matthew
AU - Viswanathan, Mahesh
AU - Dullerud, Geir E.
N1 - Publisher Copyright:
© 2020 IEEE.
PY - 2020/12/14
Y1 - 2020/12/14
N2 - Probabilistic Computation Tree Logic (PCTL) is frequently used to formally specify control objectives such as probabilistic reachability and safety. In this work, we focus on model checking PCTL specifications statistically on Markov Decision Processes (MDPs) by sampling, e.g., checking whether there exists a feasible policy such that the probability of reaching certain goal states is greater than a threshold. We use reinforcement learning to search for such a feasible policy for PCTL specifications, and then develop a statistical model checking (SMC) method with provable guarantees on its error. Specifically, we first use upper-confidence-bound (UCB) based Q-learning to design an SMC algorithm for bounded-time PCTL specifications, and then extend this algorithm to unbounded-time specifications by identifying a proper truncation time by checking the PCTL specification and its negation at the same time. Finally, we evaluate the proposed method by case studies.
AB - Probabilistic Computation Tree Logic (PCTL) is frequently used to formally specify control objectives such as probabilistic reachability and safety. In this work, we focus on model checking PCTL specifications statistically on Markov Decision Processes (MDPs) by sampling, e.g., checking whether there exists a feasible policy such that the probability of reaching certain goal states is greater than a threshold. We use reinforcement learning to search for such a feasible policy for PCTL specifications, and then develop a statistical model checking (SMC) method with provable guarantees on its error. Specifically, we first use upper-confidence-bound (UCB) based Q-learning to design an SMC algorithm for bounded-time PCTL specifications, and then extend this algorithm to unbounded-time specifications by identifying a proper truncation time by checking the PCTL specification and its negation at the same time. Finally, we evaluate the proposed method by case studies.
UR - http://www.scopus.com/inward/record.url?scp=85099883810&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85099883810&partnerID=8YFLogxK
U2 - 10.1109/CDC42340.2020.9303982
DO - 10.1109/CDC42340.2020.9303982
M3 - Conference contribution
AN - SCOPUS:85099883810
T3 - Proceedings of the IEEE Conference on Decision and Control
SP - 1392
EP - 1397
BT - 2020 59th IEEE Conference on Decision and Control, CDC 2020
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 59th IEEE Conference on Decision and Control, CDC 2020
Y2 - 14 December 2020 through 18 December 2020
ER -