Abstract
We study the Bayesian approach to variable selection for linear regression models. Motivated by recent work of Ročková and George (2014), we propose an EM algorithm that returns the MAP estimator of the set of relevant variables. Due to its particular updating scheme, our algorithm can be implemented efficiently without inverting a large matrix in each iteration and can therefore scale to large data sets. We also show that the MAP estimator returned by our EM algorithm achieves variable selection consistency even when p diverges with n. In practice, our algorithm can get stuck at local modes, a common problem with EM algorithms. To address this issue, we propose an ensemble EM algorithm, in which we repeatedly apply our EM algorithm to a subset of the samples with a subset of the covariates, and then aggregate the variable selection results across those bootstrap replicates. Empirical studies demonstrate the superior performance of the ensemble EM algorithm.
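The ensemble step described above (subsample rows, subsample covariates, run a selector, aggregate) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `base_selector` is a hypothetical stand-in (least squares followed by hard thresholding) used where the paper would run its EM algorithm, and the names `ensemble_select`, `row_frac`, `col_frac`, and `cutoff`, along with their default values, are assumptions made for the example.

```python
import numpy as np


def base_selector(X, y, threshold=1.0):
    """Hypothetical stand-in for the paper's EM step (NOT the authors' algorithm):
    fit least squares and keep coefficients whose magnitude exceeds a threshold."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return np.abs(coef) > threshold


def ensemble_select(X, y, n_rep=200, row_frac=0.8, col_frac=0.5, cutoff=0.5, seed=0):
    """Run the base selector on random row/column subsets and aggregate.

    Returns a boolean mask of selected covariates and the per-covariate
    selection frequency (selections divided by times the covariate was drawn).
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape
    n_rows = max(2, int(row_frac * n))
    n_cols = max(1, int(col_frac * p))
    picked = np.zeros(p)  # how often each covariate was selected
    drawn = np.zeros(p)   # how often each covariate was in the sampled subset
    for _ in range(n_rep):
        rows = rng.choice(n, size=n_rows, replace=False)
        cols = rng.choice(p, size=n_cols, replace=False)
        selected = base_selector(X[np.ix_(rows, cols)], y[rows])
        drawn[cols] += 1
        picked[cols] += selected
    # Covariates never drawn keep frequency 0 instead of dividing by zero.
    freq = np.divide(picked, drawn, out=np.zeros(p), where=drawn > 0)
    return freq > cutoff, freq
```

Dividing by the number of times a covariate was drawn, rather than by `n_rep`, keeps the frequencies comparable when covariates appear in different numbers of replicates. How the paper itself weights and aggregates replicates is not stated in the abstract, so this majority-frequency rule is only one plausible choice.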
Original language | English (US) |
---|---|
Pages (from-to) | 879-900 |
Number of pages | 22 |
Journal | Bayesian Analysis |
Volume | 17 |
Issue number | 3 |
State | Published - Sep 2022 |
Keywords
- Bayesian bootstrap
- Bayesian variable selection
- EM
- asymptotic consistency
ASJC Scopus subject areas
- Statistics and Probability
- Applied Mathematics