TY - GEN
T1 - Breakdown point of model selection when the number of variables exceeds the number of observations
AU - Donoho, David
AU - Stodden, Victoria
N1 - Copyright:
Copyright 2020 Elsevier B.V., All rights reserved.
PY - 2006
Y1 - 2006
N2 - The classical multivariate linear regression problem assumes p variables X1, X2, ..., Xp and a response vector y, each with n observations, and a linear relationship between the two: y = Xβ + z, where z ∼ N(0, σ²). We point out that when p > n, there is a breakdown point for standard model selection schemes, such that model selection only works well below a certain critical complexity level depending on n/p. We apply this notion to some standard model selection algorithms (Forward Stepwise, LASSO, LARS) in the case where p ≫ n. We find that 1) the breakdown point is well-defined for random X-models and low noise, 2) increasing noise shifts the breakdown point to lower levels of sparsity, and reduces the model recovery ability of the algorithm in a systematic way, and 3) below breakdown, the size of coefficient errors follows the theoretical error distribution for the classical linear model.
AB - The classical multivariate linear regression problem assumes p variables X1, X2, ..., Xp and a response vector y, each with n observations, and a linear relationship between the two: y = Xβ + z, where z ∼ N(0, σ²). We point out that when p > n, there is a breakdown point for standard model selection schemes, such that model selection only works well below a certain critical complexity level depending on n/p. We apply this notion to some standard model selection algorithms (Forward Stepwise, LASSO, LARS) in the case where p ≫ n. We find that 1) the breakdown point is well-defined for random X-models and low noise, 2) increasing noise shifts the breakdown point to lower levels of sparsity, and reduces the model recovery ability of the algorithm in a systematic way, and 3) below breakdown, the size of coefficient errors follows the theoretical error distribution for the classical linear model.
UR - http://www.scopus.com/inward/record.url?scp=40649103930&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=40649103930&partnerID=8YFLogxK
U2 - 10.1109/ijcnn.2006.246934
DO - 10.1109/ijcnn.2006.246934
M3 - Conference contribution
AN - SCOPUS:40649103930
SN - 0780394909
SN - 9780780394902
T3 - IEEE International Conference on Neural Networks - Conference Proceedings
SP - 1916
EP - 1921
BT - International Joint Conference on Neural Networks 2006, IJCNN '06
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - International Joint Conference on Neural Networks 2006, IJCNN '06
Y2 - 16 July 2006 through 21 July 2006
ER -