TY - JOUR
T1 - Regret bounds for online-learning-based linear quadratic control under database attacks
AU - Abbaszadeh Chekan, Jafar
AU - Langbort, Cedric
N1 - Publisher Copyright: © 2023 Elsevier Ltd
PY - 2023/5
Y1 - 2023/5
AB - This paper is concerned with understanding and countering the effects of database attacks on a learning-based linear quadratic adaptive controller. This type of attack targets neither sensors nor actuators; it only poisons the learning algorithm and parameter estimator that is part of the regulation scheme. We focus on the adaptive optimal control algorithm introduced by Abbasi-Yadkori and Szepesvári and provide a regret analysis in the presence of attacks, as well as modifications that mitigate their effects. A core step of this algorithm is self-regularized online least-squares estimation, which determines a tight confidence set around the true parameters of the system with high probability. In the absence of malicious data injection, this set provides an appropriate estimate of the parameters for control design. In the presence of an attack, however, this confidence set is no longer reliable. Hence, we first address how to adjust the confidence set so that it compensates for the effect of the poisoned data. We then quantify the deleterious effect of this type of attack on the optimality of the control policy by bounding the regret of the closed-loop system under attack.
KW - Database attacks
KW - Learning-based control
KW - Regret bound
UR - http://www.scopus.com/inward/record.url?scp=85148097754&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85148097754&partnerID=8YFLogxK
U2 - 10.1016/j.automatica.2023.110876
DO - 10.1016/j.automatica.2023.110876
M3 - Article
AN - SCOPUS:85148097754
SN - 0005-1098
VL - 151
JO - Automatica
JF - Automatica
M1 - 110876
ER -