High-dimensional linear regression via implicit regularization

Peng Zhao, Yun Yang, Qiaochu He

Research output: Contribution to journal › Article › peer-review

Abstract

Many statistical estimators for high-dimensional linear regression are $M$-estimators, formed by minimizing a data-dependent squared loss plus a regularizer. This work considers a new class of estimators implicitly defined through a discretized gradient dynamic system under overparameterization. We show that, under suitable restricted isometry conditions, overparameterization leads to implicit regularization: if we directly apply gradient descent to the residual sum of squares with a sufficiently small initialization then, under a proper early stopping rule, the iterates converge to a nearly sparse, rate-optimal solution that improves over explicitly regularized approaches. In particular, the resulting estimator does not suffer from the extra bias induced by explicit penalties, and can achieve the parametric root-$n$ rate when the signal-to-noise ratio is sufficiently high. We also perform simulations comparing our method with explicitly regularized approaches to high-dimensional linear regression. Our results illustrate the advantages of using implicit regularization via gradient descent after overparameterization in sparse vector estimation.
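To make the procedure described in the abstract concrete, the following is a minimal sketch, assuming the Hadamard-type overparameterization $\beta = u \odot u - v \odot v$ that is common in this literature; the initialization scale `alpha`, the step size heuristic, and the held-out validation rule for early stopping are illustrative assumptions and may differ in detail from the authors' exact algorithm.

```python
import numpy as np

def gd_implicit_reg(X, y, alpha=1e-3, lr=None, max_iter=5000, val_frac=0.2, seed=0):
    """Sketch: gradient descent on the residual sum of squares under the
    overparameterization beta = u*u - v*v, started from a small value alpha
    and stopped early via held-out validation error (assumed stopping rule)."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    # Hold out a fraction of the data to choose the early-stopping time.
    idx = rng.permutation(n)
    n_val = int(val_frac * n)
    Xv, yv = X[idx[:n_val]], y[idx[:n_val]]
    Xt, yt = X[idx[n_val:]], y[idx[n_val:]]
    if lr is None:
        # Rough step size from the spectral norm of the training design matrix.
        lr = len(yt) / (np.linalg.norm(Xt, 2) ** 2)
    u = np.full(p, alpha)
    v = np.full(p, alpha)
    best_err, best_beta = np.inf, u * u - v * v
    for _ in range(max_iter):
        beta = u * u - v * v
        grad = Xt.T @ (Xt @ beta - yt) / len(yt)   # gradient of RSS/(2n) w.r.t. beta
        u -= lr * 2 * u * grad                     # chain rule through u*u
        v += lr * 2 * v * grad                     # chain rule through -v*v
        err = np.mean((Xv @ (u * u - v * v) - yv) ** 2)
        if err < best_err:                         # keep the best early-stopped iterate
            best_err, best_beta = err, u * u - v * v
    return best_beta
```

With a small `alpha`, the iterates stay near zero in coordinates outside the true support while the signal coordinates grow, which is the implicit-regularization effect the abstract refers to; stopping too late eventually overfits the noise.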

Original language: English (US)
Pages (from-to): 1033-1046
Number of pages: 14
Journal: Biometrika
Volume: 109
Issue number: 4
State: Published - Dec 1, 2022

Keywords

  • Early stopping
  • Gradient descent
  • High-dimensional regression
  • Implicit regularization
  • Overparameterization

ASJC Scopus subject areas

  • Statistics and Probability
  • General Mathematics
  • Agricultural and Biological Sciences (miscellaneous)
  • General Agricultural and Biological Sciences
  • Statistics, Probability and Uncertainty
  • Applied Mathematics
