Abstract
We consider the computational and statistical issues of high-dimensional Bayesian model selection under Gaussian spike and slab priors. To avoid the large matrix computations needed in a standard Gibbs sampler, we propose a novel Gibbs sampler called “Skinny Gibbs,” which is much more scalable to high-dimensional problems, both in memory and in computational efficiency. In particular, its computational complexity grows only linearly in p, the number of predictors, while it retains strong model selection consistency even when p is much greater than the sample size n. The present article focuses on logistic regression due to its broad applicability as a representative member of the generalized linear models. We compare our proposed method with several leading variable selection methods through a simulation study to show that Skinny Gibbs performs strongly, as indicated by our theoretical work. Supplementary materials for this article are available online.
Original language | English (US) |
---|---|
Pages (from-to) | 1205-1217 |
Number of pages | 13 |
Journal | Journal of the American Statistical Association |
Volume | 114 |
Issue number | 527 |
State | Published - Jul 3 2019 |
Keywords
- Bayesian computation
- Gibbs sampling
- High-dimensional data
- Logistic regression
- Scalable computation
- Variable selection
ASJC Scopus subject areas
- Statistics and Probability
- Statistics, Probability and Uncertainty