TY - GEN
T1 - Accelerating SGD for Highly Ill-Conditioned Huge-Scale Online Matrix Completion
AU - Zhang, Gavin
AU - Chiu, Hong Ming
AU - Zhang, Richard Y.
N1 - The authors thank Salar Fattahi for helpful discussions and feedback on an earlier draft. Financial support for this work was provided in part by the NSF CAREER Award ECCS-2047462 and in part by C3.ai Inc. and the Microsoft Corporation via the C3.ai Digital Transformation Institute.
PY - 2022
Y1 - 2022
N2 - The matrix completion problem seeks to recover a d × d ground truth matrix of low rank r ≪ d from observations of its individual elements. Real-world matrix completion is often a huge-scale optimization problem, with d so large that even the simplest full-dimension vector operations with O(d) time complexity become prohibitively expensive. Stochastic gradient descent (SGD) is one of the few algorithms capable of solving matrix completion on a huge scale, and can also naturally handle streaming data over an evolving ground truth. Unfortunately, SGD experiences a dramatic slow-down when the underlying ground truth is ill-conditioned; it requires at least O(κ log(1/ε)) iterations to get ε-close to a ground truth matrix with condition number κ. In this paper, we propose a preconditioned version of SGD that preserves all the favorable practical qualities of SGD for huge-scale online optimization while also making it agnostic to κ. For a symmetric ground truth and the Root Mean Square Error (RMSE) loss, we prove that the preconditioned SGD converges to ε-accuracy in O(log(1/ε)) iterations, with a rapid linear convergence rate as if the ground truth were perfectly conditioned with κ = 1. In our experiments, we observe a similar acceleration for item-item collaborative filtering on the MovieLens25M dataset via a pair-wise ranking loss, with 100 million training pairs and 10 million testing pairs. [See supporting code at https://github.com/Hong-Ming/ScaledSGD].
UR - http://www.scopus.com/inward/record.url?scp=85163215649&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85163215649&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85163215649
T3 - Advances in Neural Information Processing Systems
BT - Advances in Neural Information Processing Systems 35 - 36th Conference on Neural Information Processing Systems, NeurIPS 2022
A2 - Koyejo, S.
A2 - Mohamed, S.
A2 - Agarwal, A.
A2 - Belgrave, D.
A2 - Cho, K.
A2 - Oh, A.
PB - Neural Information Processing Systems Foundation
T2 - 36th Conference on Neural Information Processing Systems, NeurIPS 2022
Y2 - 28 November 2022 through 9 December 2022
ER -