TY - JOUR
T1 - Improved global guarantees for the nonconvex Burer–Monteiro factorization via rank overparameterization
AU - Zhang, Richard Y.
N1 - Financial support for this work was provided in part by the NSF CAREER Award ECCS-2047462 and in part by C3.ai Inc. and the Microsoft Corporation via the C3.ai Digital Transformation Institute.
PY - 2024
Y1 - 2024
N2 - We consider minimizing a twice-differentiable, L-smooth, and μ-strongly convex objective ϕ over an n×n positive semidefinite matrix M⪰0, under the assumption that the minimizer M⋆ has low rank r⋆≪n. Following the Burer–Monteiro approach, we instead minimize the nonconvex objective f(X)=ϕ(XXT) over a factor matrix X of size n×r. This substantially reduces the number of variables from O(n2) to as few as O(n) and also enforces positive semidefiniteness for free, but at the cost of giving up the convexity of the original problem. In this paper, we prove that if the search rank r≥r⋆ is overparameterized by a constant factor with respect to the true rank r⋆, namely as in r>14(L/μ-1)2r⋆, then despite nonconvexity, local optimization is guaranteed to globally converge from any initial point to the global optimum. This significantly improves upon a previous rank overparameterization threshold of r≥n, which we show is sharp in the absence of smoothness and strong convexity, but would increase the number of variables back up to O(n2). Conversely, without rank overparameterization, we prove that such a global guarantee is possible if and only if ϕ is almost perfectly conditioned, with a condition number of L/μ<3. Therefore, we conclude that a small amount of overparameterization can lead to large improvements in theoretical guarantees for the nonconvex Burer–Monteiro factorization.
AB - We consider minimizing a twice-differentiable, L-smooth, and μ-strongly convex objective ϕ over an n×n positive semidefinite matrix M⪰0, under the assumption that the minimizer M⋆ has low rank r⋆≪n. Following the Burer–Monteiro approach, we instead minimize the nonconvex objective f(X)=ϕ(XXT) over a factor matrix X of size n×r. This substantially reduces the number of variables from O(n2) to as few as O(n) and also enforces positive semidefiniteness for free, but at the cost of giving up the convexity of the original problem. In this paper, we prove that if the search rank r≥r⋆ is overparameterized by a constant factor with respect to the true rank r⋆, namely as in r>14(L/μ-1)2r⋆, then despite nonconvexity, local optimization is guaranteed to globally converge from any initial point to the global optimum. This significantly improves upon a previous rank overparameterization threshold of r≥n, which we show is sharp in the absence of smoothness and strong convexity, but would increase the number of variables back up to O(n2). Conversely, without rank overparameterization, we prove that such a global guarantee is possible if and only if ϕ is almost perfectly conditioned, with a condition number of L/μ<3. Therefore, we conclude that a small amount of overparameterization can lead to large improvements in theoretical guarantees for the nonconvex Burer–Monteiro factorization.
KW - 90C22
KW - 90C26
KW - 90C30
KW - 90C46
UR - http://www.scopus.com/inward/record.url?scp=85209741774&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85209741774&partnerID=8YFLogxK
U2 - 10.1007/s10107-024-02160-6
DO - 10.1007/s10107-024-02160-6
M3 - Article
AN - SCOPUS:85209741774
SN - 0025-5610
JO - Mathematical Programming
JF - Mathematical Programming
ER -