TY - GEN
T1 - Deception in Optimal Control
AU - Ornik, Melkior
AU - Topcu, Ufuk
N1 - Publisher Copyright:
© 2018 IEEE.
PY - 2018/7/2
Y1 - 2018/7/2
AB - In this paper, we consider an adversarial scenario in which one agent seeks to achieve an objective, while its adversary seeks to learn the agent's intentions and prevent the agent from achieving its objective. Thus, the agent has an incentive to deceive the adversary about its intentions while at the same time working to achieve its objective. The primary contribution of this paper is a mathematically rigorous framework for the notion of deception within the context of optimal control. The central notion introduced in the paper is that of a belief-induced reward: a reward that depends not only on the agent's state and action, but also on the adversary's beliefs. While the paper focuses on the setting of Markov decision processes, the proposed framework allows deception to be defined in an arbitrary control system endowed with a reward function. Furthermore, we discuss the design of optimal deceptive strategies, under possible additional specifications on the agent's behavior or uncertainty in the agent's knowledge about the adversary's learning process, and show that such design reduces to known problems in control design for partially observable or uncertain Markov decision processes. The notion of deception is illustrated by a running example of 'cops and deceptive robbers'. We show that an optimal deceptive strategy for this example, designed using the theory introduced in the paper, follows the intuitive idea of how to deceive an adversary in such a scenario.
UR - http://www.scopus.com/inward/record.url?scp=85062888871&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85062888871&partnerID=8YFLogxK
DO - 10.1109/ALLERTON.2018.8635871
M3 - Conference contribution
AN - SCOPUS:85062888871
T3 - 2018 56th Annual Allerton Conference on Communication, Control, and Computing, Allerton 2018
SP - 821
EP - 828
BT - 2018 56th Annual Allerton Conference on Communication, Control, and Computing, Allerton 2018
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 56th Annual Allerton Conference on Communication, Control, and Computing, Allerton 2018
Y2 - 2 October 2018 through 5 October 2018
ER -