Mean field analysis of neural networks: A central limit theorem

Justin Sirignano, Konstantinos Spiliopoulos

Research output: Contribution to journal › Article › peer-review

Abstract

We rigorously prove a central limit theorem for neural network models with a single hidden layer. The central limit theorem is proven in the asymptotic regime of simultaneously (A) large numbers of hidden units and (B) large numbers of stochastic gradient descent training iterations. Our result describes the neural network's fluctuations around its mean-field limit. The fluctuations have a Gaussian distribution and satisfy a stochastic partial differential equation. The proof relies upon weak convergence methods from stochastic analysis. In particular, we prove relative compactness for the sequence of processes and uniqueness of the limiting process in a suitable Sobolev space.
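
For concreteness, the following is a minimal illustrative sketch of the scaling behind this kind of result; the notation (N hidden units, parameters $(c^i, w^i)$, empirical measure $\mu^N_t$) is chosen here for illustration and does not necessarily match the paper's exactly. A single-hidden-layer network with the mean-field $1/N$ normalization, and the empirical measure of its parameters along the SGD trajectory, can be written as
\[
g^N(x;\theta) \;=\; \frac{1}{N}\sum_{i=1}^{N} c^i\,\sigma\!\left(w^i\cdot x\right),
\qquad
\mu^N_t \;=\; \frac{1}{N}\sum_{i=1}^{N} \delta_{\left(c^i_{\lfloor Nt\rfloor},\; w^i_{\lfloor Nt\rfloor}\right)},
\]
where the parameters $(c^i_k, w^i_k)$ are updated by stochastic gradient descent and the iteration count $k=\lfloor Nt\rfloor$ is rescaled into continuous time, so that regimes (A) and (B) are taken jointly. Under this scaling, $\mu^N_t$ converges as $N\to\infty$ to a deterministic mean-field limit $\bar\mu_t$, and a central limit theorem of this type concerns the fluctuation process
\[
\eta^N_t \;=\; \sqrt{N}\,\bigl(\mu^N_t - \bar\mu_t\bigr),
\]
whose limit is a Gaussian process satisfying a stochastic partial differential equation.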

Original language: English (US)
Pages (from-to): 1820-1852
Number of pages: 33
Journal: Stochastic Processes and their Applications
Volume: 130
Issue number: 3
State: Published - Mar 2020

ASJC Scopus subject areas

  • Statistics and Probability
  • Modeling and Simulation
  • Applied Mathematics
