Self-Regularity of Output Weights for Overparameterized Two-Layer Neural Networks

David Gamarnik, Eren C. Kizildag, Ilias Zadik

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

We consider the problem of finding a two-layer neural network with sigmoid, rectified linear unit (ReLU), or binary step activation functions that fits a training data set as accurately as possible, as quantified by the training error, and study the following question: does a low training error guarantee that the norm of the output layer (the outer norm) is itself small? We address this question for the case of non-negative output weights. Using a simple covering number argument, we establish that, under quite mild distributional assumptions on the input/label pairs, any such network achieving a small training error on polynomially many data points necessarily has a well-controlled outer norm. Notably, our results (a) have good sample complexity, (b) are independent of the number of hidden units, (c) are oblivious to the training algorithm, and (d) require quite mild assumptions on the data (in particular, the input vector X ∈ ℝ^d need not have independent coordinates). We then show how our bounds can be leveraged to yield generalization guarantees for such networks.
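The quantities discussed in the abstract can be made concrete with a minimal sketch. The code below (illustrative only; the dimensions, data distribution, and loss are assumptions, not the paper's exact setting) builds a two-layer ReLU network f(x) = Σ_j a_j σ(⟨w_j, x⟩) with non-negative output weights, and computes the two quantities the result relates: the empirical training error and the outer norm Σ_j a_j.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes (not from the paper): input dimension, hidden units, samples.
d, m, n = 5, 50, 200

W = rng.standard_normal((m, d))        # inner (hidden-layer) weights
a = rng.uniform(0.0, 1.0, size=m)      # non-negative output weights, a_j >= 0


def relu(z):
    """Rectified linear unit activation."""
    return np.maximum(z, 0.0)


# Synthetic input/label pairs (Gaussian here purely for illustration;
# the paper's assumptions on the data are much milder/more general).
X = rng.standard_normal((n, d))
y = rng.standard_normal(n)

preds = relu(X @ W.T) @ a                  # network outputs f(X_i)
train_err = np.mean((preds - y) ** 2)      # empirical (squared) training error
outer_norm = np.sum(a)                     # outer norm of the output layer
```

The paper's result says that, in this non-negative-weight regime, a small `train_err` on polynomially many samples forces `outer_norm` to be well controlled, regardless of the number of hidden units `m` or how the weights were found.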

Original language: English (US)
Title of host publication: 2021 IEEE International Symposium on Information Theory, ISIT 2021 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 819-824
Number of pages: 6
ISBN (Electronic): 9781538682098
DOIs
State: Published - Jul 12 2021
Externally published: Yes
Event: 2021 IEEE International Symposium on Information Theory, ISIT 2021 - Virtual, Melbourne, Australia
Duration: Jul 12 2021 - Jul 20 2021

Publication series

Name: IEEE International Symposium on Information Theory - Proceedings
Volume: 2021-July
ISSN (Print): 2157-8095

Conference

Conference: 2021 IEEE International Symposium on Information Theory, ISIT 2021
Country/Territory: Australia
City: Virtual, Melbourne
Period: 7/12/21 - 7/20/21

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Information Systems
  • Modeling and Simulation
  • Applied Mathematics
