TY - JOUR

T1 - Finite-Time Error Bounds For Linear Stochastic Approximation and TD Learning

AU - Srikant, R.

AU - Ying, Lei

N1 - Funding Information:
Acknowledgments: Research supported by the following grants: NSF NeTS 1718203, NSF CPS ECCS 1739189, NSF CPS ECCS 1739344, CMMI 1562276, NSF ECCS 1609370, NSF ECCS 1609202, NSF/USDA Grant AG 2018-67007-28379, NSF NeTS 1813392, and NSF SpecEES 1824393.
Publisher Copyright:
© 2019 R. Srikant & L. Ying.

PY - 2019

Y1 - 2019

N2 - We consider the dynamics of a linear stochastic approximation algorithm driven by Markovian noise, and derive finite-time bounds on the moments of the error, i.e., deviation of the output of the algorithm from the equilibrium point of an associated ordinary differential equation (ODE). We obtain finite-time bounds on the mean-square error in the case of constant step-size algorithms by considering the drift of an appropriately chosen Lyapunov function. The Lyapunov function is a standard Lyapunov function used to study the stability of linear ODEs, but can also be interpreted in terms of Stein’s method, which is used to obtain bounds on steady-state performance. We also provide a comprehensive treatment of the moments of the square of the 2-norm of the approximation error. Our analysis yields the following results: (i) for a given step-size, we show that the lower-order moments can be made small as a function of the step-size and can be upper-bounded by the moments of a Gaussian random variable; (ii) we show that the higher-order moments beyond a threshold may be infinite in steady-state; and (iii) we characterize the number of samples needed for the finite-time bounds to be of the same order as the steady-state bounds. As a by-product of our analysis, we also solve the problem of obtaining finite-time bounds for the performance of temporal difference learning algorithms with linear function approximation and a constant step-size, without requiring a projection step or an i.i.d. noise assumption.

AB - We consider the dynamics of a linear stochastic approximation algorithm driven by Markovian noise, and derive finite-time bounds on the moments of the error, i.e., deviation of the output of the algorithm from the equilibrium point of an associated ordinary differential equation (ODE). We obtain finite-time bounds on the mean-square error in the case of constant step-size algorithms by considering the drift of an appropriately chosen Lyapunov function. The Lyapunov function is a standard Lyapunov function used to study the stability of linear ODEs, but can also be interpreted in terms of Stein’s method, which is used to obtain bounds on steady-state performance. We also provide a comprehensive treatment of the moments of the square of the 2-norm of the approximation error. Our analysis yields the following results: (i) for a given step-size, we show that the lower-order moments can be made small as a function of the step-size and can be upper-bounded by the moments of a Gaussian random variable; (ii) we show that the higher-order moments beyond a threshold may be infinite in steady-state; and (iii) we characterize the number of samples needed for the finite-time bounds to be of the same order as the steady-state bounds. As a by-product of our analysis, we also solve the problem of obtaining finite-time bounds for the performance of temporal difference learning algorithms with linear function approximation and a constant step-size, without requiring a projection step or an i.i.d. noise assumption.

UR - http://www.scopus.com/inward/record.url?scp=85138488009&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85138488009&partnerID=8YFLogxK

M3 - Conference article

AN - SCOPUS:85138488009

SN - 2640-3498

VL - 99

SP - 2803

EP - 2830

JO - Proceedings of Machine Learning Research

JF - Proceedings of Machine Learning Research

T2 - 32nd Conference on Learning Theory, COLT 2019

Y2 - 25 June 2019 through 28 June 2019

ER -