TY - JOUR
T1 - On the landscape of one-hidden-layer sparse networks and beyond
AU - Lin, Dachao
AU - Sun, Ruoyu
AU - Zhang, Zhihua
N1 - Funding Information:
Lin and Zhang have been supported by the National Key Research and Development Project of China (No. 2018AAA0101004).
Publisher Copyright:
© 2022 Elsevier B.V.
PY - 2022/8
Y1 - 2022/8
N2 - Sparse neural networks have received increasing interest due to their small size compared to dense networks. Nevertheless, most existing works on neural network theory have focused on dense neural networks, and the understanding of sparse networks is very limited. In this paper we study the loss landscape of one-hidden-layer sparse networks. First, we consider sparse networks with a dense final layer. We show that linear networks can have no spurious valleys under special sparse structures, and non-linear networks could also admit no spurious valleys under a wide final layer. Second, we discover that spurious valleys and spurious minima can exist for wide sparse networks with a sparse final layer. This is different from wide dense networks which do not have spurious valleys under mild assumptions.
AB - Sparse neural networks have received increasing interest due to their small size compared to dense networks. Nevertheless, most existing works on neural network theory have focused on dense neural networks, and the understanding of sparse networks is very limited. In this paper we study the loss landscape of one-hidden-layer sparse networks. First, we consider sparse networks with a dense final layer. We show that linear networks can have no spurious valleys under special sparse structures, and non-linear networks could also admit no spurious valleys under a wide final layer. Second, we discover that spurious valleys and spurious minima can exist for wide sparse networks with a sparse final layer. This is different from wide dense networks which do not have spurious valleys under mild assumptions.
KW - Deep learning theory
KW - Landscape
KW - Optimization
KW - Sparse neural networks
UR - http://www.scopus.com/inward/record.url?scp=85130178016&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85130178016&partnerID=8YFLogxK
U2 - 10.1016/j.artint.2022.103739
DO - 10.1016/j.artint.2022.103739
M3 - Article
AN - SCOPUS:85130178016
SN - 0004-3702
VL - 309
JO - Artificial Intelligence
JF - Artificial Intelligence
M1 - 103739
ER -