TY - GEN
T1 - Optimal Deceptive and Reference Policies for Supervisory Control
AU - Karabag, Mustafa O.
AU - Ornik, Melkior
AU - Topcu, Ufuk
N1 - Publisher Copyright: © 2019 IEEE.
PY - 2019/12
Y1 - 2019/12
N2 - The use of deceptive strategies is important for an agent that attempts not to reveal his intentions in an adversarial environment. We consider a setting in which a supervisor provides a reference policy and expects an agent to follow the reference policy and perform a task. The agent may instead follow a different, deceptive policy to achieve a different task. We model the environment and the behavior of the agent with a Markov decision process, represent the tasks of the agent and the supervisor with linear temporal logic formulae, and study the synthesis of optimal deceptive policies for such agents. We also study the synthesis of optimal reference policies that prevent deceptive strategies of the agent and achieve the supervisor's task with high probability. We show that the synthesis of deceptive policies has a convex optimization problem formulation, while the synthesis of reference policies requires solving a nonconvex optimization problem.
AB - The use of deceptive strategies is important for an agent that attempts not to reveal his intentions in an adversarial environment. We consider a setting in which a supervisor provides a reference policy and expects an agent to follow the reference policy and perform a task. The agent may instead follow a different, deceptive policy to achieve a different task. We model the environment and the behavior of the agent with a Markov decision process, represent the tasks of the agent and the supervisor with linear temporal logic formulae, and study the synthesis of optimal deceptive policies for such agents. We also study the synthesis of optimal reference policies that prevent deceptive strategies of the agent and achieve the supervisor's task with high probability. We show that the synthesis of deceptive policies has a convex optimization problem formulation, while the synthesis of reference policies requires solving a nonconvex optimization problem.
UR - http://www.scopus.com/inward/record.url?scp=85082437142&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85082437142&partnerID=8YFLogxK
U2 - 10.1109/CDC40024.2019.9029607
DO - 10.1109/CDC40024.2019.9029607
M3 - Conference contribution
AN - SCOPUS:85082437142
T3 - Proceedings of the IEEE Conference on Decision and Control
SP - 1323
EP - 1330
BT - 2019 IEEE 58th Conference on Decision and Control, CDC 2019
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 58th IEEE Conference on Decision and Control, CDC 2019
Y2 - 11 December 2019 through 13 December 2019
ER -