On the landscape of one-hidden-layer sparse networks and beyond

Dachao Lin, Ruoyu Sun, Zhihua Zhang

Research output: Contribution to journal, Article, peer-review

Abstract

Sparse neural networks have received increasing interest due to their small size compared to dense networks. Nevertheless, most existing works on neural network theory have focused on dense neural networks, and the understanding of sparse networks is very limited. In this paper, we study the loss landscape of one-hidden-layer sparse networks. First, we consider sparse networks with a dense final layer. We show that linear networks can have no spurious valleys under special sparse structures, and non-linear networks could also admit no spurious valleys under a wide final layer. Second, we discover that spurious valleys and spurious minima can exist for wide sparse networks with a sparse final layer. This is different from wide dense networks, which do not have spurious valleys under mild assumptions.

Original language: English (US)
Article number: 103739
Journal: Artificial Intelligence
Volume: 309
State: Published - Aug 2022

Keywords

  • Deep learning theory
  • Landscape
  • Optimization
  • Sparse neural networks

ASJC Scopus subject areas

  • Language and Linguistics
  • Linguistics and Language
  • Artificial Intelligence
