TY - JOUR
T1 - Learn and Control While Switching
T2 - Guaranteed Stability and Sublinear Regret
AU - Chekan, Jafar Abbaszadeh
AU - Langbort, Cedric
N1 - This work was supported in part by NSF under Award #2007604 and in part by a grant from the C3AI Digital Technology Institute. Recommended by Associate Editor T. Faulwasser.
J. A. Chekan and C. Langbort (emails: jafar2 & [email protected]) are with the Coordinated Science Laboratory and the Department of Aerospace Engineering at the University of Illinois at Urbana-Champaign (UIUC).
PY - 2024
Y1 - 2024
N2 - Overactuated systems often make it possible to achieve specific performances by switching between different subsets of actuators. However, when the system parameters are unknown, transferring authority to different subsets of actuators is challenging due to stability and performance efficiency concerns. This article presents an efficient algorithm to tackle the so-called "learn and control while switching between different actuating modes" problem in the linear quadratic setting. Our proposed strategy is built upon an optimism in the face of uncertainty (OFU)-based algorithm equipped with a projection toolbox to keep the algorithm efficient, regretwise. Along the way, we derive an optimum duration for the warm-up phase, thanks to the existence of a stabilizing neighborhood. The stability of the switched system is also guaranteed by designing a minimum average dwell time. The proposed strategy is proved to have a regret bound of O(ns√T) in horizon T, where ns is the number of switches, provably outperforming naively applying the basic OFU algorithm.
AB - Overactuated systems often make it possible to achieve specific performances by switching between different subsets of actuators. However, when the system parameters are unknown, transferring authority to different subsets of actuators is challenging due to stability and performance efficiency concerns. This article presents an efficient algorithm to tackle the so-called "learn and control while switching between different actuating modes" problem in the linear quadratic setting. Our proposed strategy is built upon an optimism in the face of uncertainty (OFU)-based algorithm equipped with a projection toolbox to keep the algorithm efficient, regretwise. Along the way, we derive an optimum duration for the warm-up phase, thanks to the existence of a stabilizing neighborhood. The stability of the switched system is also guaranteed by designing a minimum average dwell time. The proposed strategy is proved to have a regret bound of O(ns√T) in horizon T, where ns is the number of switches, provably outperforming naively applying the basic OFU algorithm.
KW - Overactuated system
KW - regret
KW - reinforcement learning
KW - switched system
UR - http://www.scopus.com/inward/record.url?scp=85200824055&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85200824055&partnerID=8YFLogxK
U2 - 10.1109/TAC.2024.3440348
DO - 10.1109/TAC.2024.3440348
M3 - Article
AN - SCOPUS:85200824055
SN - 0018-9286
VL - 69
SP - 8433
EP - 8448
JO - IEEE Transactions on Automatic Control
JF - IEEE Transactions on Automatic Control
IS - 12
ER -