TY - JOUR
T1 - Control-Oriented Learning on the Fly
AU - Ornik, Melkior
AU - Carr, Steven
AU - Israel, Arie
AU - Topcu, Ufuk
N1 - Manuscript received August 12, 2019; accepted December 4, 2019. Date of publication December 31, 2019; date of current version October 21, 2020. This work was supported in part by the Air Force Office of Scientific Research under Grant FA9550-19-1-0005 and in part by the Defense Advanced Research Projects Agency under Grant FA8750-19-C-0092. Recommended by Associate Editor F. Wirth. This article is a substantially extended version of [1]. It provides proofs omitted from [1]. Entirely novel material includes Sections IV, VII-B, and VIII, Appendix A, and parts of several other sections. (Corresponding author: Melkior Ornik.) M. Ornik is with the University of Illinois at Urbana–Champaign, Champaign, IL 61801 USA (e-mail: [email protected]).
PY - 2020/11
Y1 - 2020/11
N2 - This article focuses on developing a strategy for control of systems whose dynamics are almost entirely unknown. This situation arises naturally in a scenario where a system undergoes a critical failure. In that case, it is imperative to retain the ability to satisfy basic control objectives in order to avert an imminent catastrophe. To deal with limitations on the knowledge of system dynamics, we develop a theory of myopic control. At any given time, myopic control optimizes the current direction of the system trajectory, given solely the knowledge about the system dynamics obtained from data up to that time. We propose an algorithm that uses small perturbations in the control effort to learn the system dynamics around the current system state, while ensuring that the system moves in a nearly optimal direction, and provide bounds on its suboptimality.
AB - This article focuses on developing a strategy for control of systems whose dynamics are almost entirely unknown. This situation arises naturally in a scenario where a system undergoes a critical failure. In that case, it is imperative to retain the ability to satisfy basic control objectives in order to avert an imminent catastrophe. To deal with limitations on the knowledge of system dynamics, we develop a theory of myopic control. At any given time, myopic control optimizes the current direction of the system trajectory, given solely the knowledge about the system dynamics obtained from data up to that time. We propose an algorithm that uses small perturbations in the control effort to learn the system dynamics around the current system state, while ensuring that the system moves in a nearly optimal direction, and provide bounds on its suboptimality.
KW - Estimation
KW - learning
KW - optimal control
KW - uncertain systems
UR - http://www.scopus.com/inward/record.url?scp=85095679030&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85095679030&partnerID=8YFLogxK
U2 - 10.1109/TAC.2019.2963293
DO - 10.1109/TAC.2019.2963293
M3 - Article
AN - SCOPUS:85095679030
SN - 0018-9286
VL - 65
SP - 4800
EP - 4807
JO - IEEE Transactions on Automatic Control
JF - IEEE Transactions on Automatic Control
IS - 11
M1 - 8946327
ER -