TY - JOUR
T1 - Online Adaptive Policy Selection in Time-Varying Systems
T2 - 37th Conference on Neural Information Processing Systems, NeurIPS 2023
AU - Lin, Yiheng
AU - Preiss, James A.
AU - Anand, Emile
AU - Li, Yingying
AU - Yue, Yisong
AU - Wierman, Adam
N1 - This work is supported by NSF Grants CNS-2146814, CPS-2136197, CNS-2106403, NGSDI-2105648, and CCF-1918865 and by a gift from Latitude AI, with additional support for Yiheng Lin provided by an Amazon AI4Science Fellowship and a PIMCO Graduate Fellowship in Data Science.
PY - 2023
Y1 - 2023
AB - We study online adaptive policy selection in systems with time-varying costs and dynamics. We develop the Gradient-based Adaptive Policy Selection (GAPS) algorithm together with a general analytical framework for online policy selection via online optimization. Under our proposed notion of contractive policy classes, we show that GAPS approximates the behavior of an ideal online gradient descent algorithm on the policy parameters while requiring less information and computation. When convexity holds, our algorithm is the first to achieve optimal policy regret. When convexity does not hold, we provide the first local regret bound for online policy selection. Our numerical experiments show that GAPS can adapt to changing environments more quickly than existing benchmarks.
UR - http://www.scopus.com/inward/record.url?scp=85192279760&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85192279760&partnerID=8YFLogxK
M3 - Conference article
AN - SCOPUS:85192279760
SN - 1049-5258
VL - 36
JO - Advances in Neural Information Processing Systems
JF - Advances in Neural Information Processing Systems
Y2 - 10 December 2023 through 16 December 2023
ER -