Abstract
Traditional landscape analysis of deep neural networks aims to show that no suboptimal local minima exist in some appropriate sense. From this, one may be tempted to conclude that descent algorithms which escape saddle points will reach a good local minimum. However, basic optimization theory tells us that a descent algorithm can also diverge to infinity if there are paths to infinity along which the loss function decreases. It is not clear whether, for nonlinear neural networks, there exists a setting in which "no bad local minima" and "no decreasing paths to infinity" can be achieved simultaneously. In this paper, we give the first positive answer to this question. More specifically, for a large class of overparameterized deep neural networks with appropriate regularizers, the loss function has no bad local minima and no decreasing paths to infinity. The key mathematical trick is to show that the set of possibly undesirable regularizers can be viewed as the image of a Lipschitz continuous mapping from a lower-dimensional Euclidean space to a higher-dimensional Euclidean space, and thus has measure zero.
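As a rough illustration of the measure-zero argument the abstract alludes to, the following is the standard Lipschitz-image lemma (a well-known fact, not a restatement of the paper's proof): a Lipschitz map from a strictly lower-dimensional Euclidean space cannot fill a set of positive measure.

```latex
\begin{lemma}[Lipschitz image of a lower-dimensional space has measure zero]
Let $m < n$ and let $f \colon \mathbb{R}^m \to \mathbb{R}^n$ be Lipschitz continuous,
i.e. $\|f(x) - f(y)\| \le L \|x - y\|$ for all $x, y \in \mathbb{R}^m$ and some $L \ge 0$.
Then the image $f(\mathbb{R}^m)$ has $n$-dimensional Lebesgue measure zero.
\end{lemma}
% Sketch: cover $\mathbb{R}^m$ by cubes of side $\varepsilon$; each cube maps into a ball
% of radius $L\sqrt{m}\,\varepsilon$, whose $n$-dimensional volume is $O(\varepsilon^{n})$.
% Since only $O(\varepsilon^{-m})$ cubes meet any bounded region and $n > m$, the total
% covering volume is $O(\varepsilon^{\,n-m}) \to 0$ as $\varepsilon \to 0$.
```

In the paper's setting (per the abstract), the set of undesirable regularizers is expressed as such an image, so it occupies a measure-zero subset of the regularizer space.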
| Original language | English (US) |
|---|---|
| Pages (from-to) | 2797-2827 |
| Number of pages | 31 |
| Journal | SIAM Journal on Optimization |
| Volume | 32 |
| Issue number | 4 |
| State | Published - Dec 2022 |
Keywords
- landscape
- decreasing paths to infinity
- deep neural network
- local minimum
ASJC Scopus subject areas
- Software
- Theoretical Computer Science
- Applied Mathematics