TY - JOUR
T1 - A unified switching system perspective and convergence analysis of Q-learning algorithms
AU - Lee, Donghwan
AU - He, Niao
N1 - Funding Information:
We thank the reviewers and area chair for constructive feedback. We would like to thank Csaba Szepesvari, Bin Hu, and Rohit Gupta for insightful comments. The work was supported by NSF CRII 1755829 and NSF CCF 1934986.
Publisher Copyright:
© 2020 Neural Information Processing Systems Foundation. All rights reserved.
PY - 2020
Y1 - 2020
N2 - This paper develops a novel and unified framework to analyze the convergence of a large family of Q-learning algorithms from the switching system perspective. We show that the nonlinear ODE models associated with Q-learning and many of its variants can be naturally formulated as affine switching systems. Building on their asymptotic stability, we obtain a number of interesting results: (i) we provide a simple ODE analysis for the convergence of asynchronous Q-learning under relatively weak assumptions; (ii) we establish the first convergence analysis of the averaging Q-learning algorithm; and (iii) we derive a new sufficient condition for the convergence of Q-learning with linear function approximation.
UR - http://www.scopus.com/inward/record.url?scp=85102145972&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85102145972&partnerID=8YFLogxK
M3 - Conference article
AN - SCOPUS:85102145972
SN - 1049-5258
VL - 2020-December
JO - Advances in Neural Information Processing Systems
JF - Advances in Neural Information Processing Systems
T2 - 34th Conference on Neural Information Processing Systems, NeurIPS 2020
Y2 - 6 December 2020 through 12 December 2020
ER -