Abstract
We propose new nonparametric empirical Bayes methods for high-dimensional classification. Our classifiers are designed to approximate the Bayes classifier in a hypothesized hierarchical model, where the prior distributions for the model parameters are estimated nonparametrically from the training data. As is common with nonparametric empirical Bayes, the proposed classifiers are effective in high-dimensional settings even when the underlying model parameters are in fact nonrandom. We use nonparametric maximum likelihood estimates of the prior distributions, following the elegant approach studied by Kiefer & Wolfowitz in the 1950s. However, our implementation is based on a recent convex optimization framework for approximating these estimates that is well-suited for large-scale problems. We derive new theoretical results on the accuracy of the approximate estimator, which help control the misclassification rate of one of our classifiers. We show that our methods outperform several existing methods in simulations and perform well when gene expression microarray data is used to classify cancer patients.
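The core device in the abstract, the Kiefer–Wolfowitz nonparametric maximum likelihood estimate of the prior, is commonly approximated by fixing a grid of support points, which makes the log-likelihood concave in the mixing weights. The sketch below is purely illustrative and is not the paper's implementation (the paper uses a modern convex-optimization framework; the grid, Gaussian kernel, and EM-style fixed-point update here are my own assumptions):

```python
import numpy as np

def npmle_weights(x, grid, sigma=1.0, n_iter=500):
    """Approximate Kiefer-Wolfowitz NPMLE of a mixing distribution
    supported on a fixed grid, for a Gaussian location mixture."""
    # A[i, j] is proportional to the likelihood of x[i] under grid atom j.
    A = np.exp(-0.5 * ((x[:, None] - grid[None, :]) / sigma) ** 2)
    w = np.full(len(grid), 1.0 / len(grid))  # start from uniform weights
    for _ in range(n_iter):
        dens = A @ w                              # mixture density at each x[i]
        w *= (A / dens[:, None]).mean(axis=0)     # EM fixed-point update
        w /= w.sum()                              # keep w on the simplex
    return w
```

Because the objective is concave in the weights, this simple multiplicative update increases the likelihood at every step; the interior-point methods referenced in the abstract solve the same grid-restricted problem far faster at scale.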
| Original language | English (US) |
| --- | --- |
| Pages (from-to) | 21–34 |
| Number of pages | 14 |
| Journal | Biometrika |
| Volume | 103 |
| Issue number | 1 |
| DOIs | |
| State | Published - Jan 1 2015 |
Keywords
- Classification
- Convex optimization
- Empirical Bayes estimation
- Kiefer-Wolfowitz estimator
- Mixture models
- Nonparametric maximum likelihood estimation
ASJC Scopus subject areas
- Statistics and Probability
- General Mathematics
- Agricultural and Biological Sciences (miscellaneous)
- General Agricultural and Biological Sciences
- Statistics, Probability and Uncertainty
- Applied Mathematics