Abstract
In this paper, we introduce a shrinkage-to-tapering approach for estimating large covariance matrices when the number of samples is substantially fewer than the number of variables (i.e., n, p → ∞ and p/n → ∞). The proposed estimator improves upon both shrinkage and tapering estimators by shrinking the sample covariance matrix to its tapered version. We first show that, under both normalized Frobenius and spectral risks, the minimum mean-squared error (MMSE) shrinkage-to-identity estimator is inconsistent and is outperformed by a minimax tapering estimator for a class of high-dimensional and diagonally dominant covariance matrices. Motivated by this observation, we propose a shrinkage-to-tapering oracle (STO) estimator for efficient estimation of general, large covariance matrices. A closed-form formula for the optimal coefficient ρ of the proposed STO estimator is derived under the minimum Frobenius risk. Because the true covariance matrix is unknown and must itself be estimated, we further propose an STO approximating (STOA) algorithm with a data-driven bandwidth selection procedure to iteratively estimate the coefficient ρ and the covariance matrix. We study the finite-sample performance of different estimators, and our simulation results clearly show the improved performance of the proposed STO estimators. Finally, the proposed STOA method is applied to a real breast cancer gene expression data set.
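
To make the idea of "shrinking the sample covariance matrix to its tapered version" concrete, the following is a minimal NumPy sketch, not the paper's implementation. It assumes a convex combination of the form (1 − ρ)·S + ρ·(W ∘ S), where S is the sample covariance and W is a trapezoidal tapering weight matrix of bandwidth k. The function names, the choice of which term ρ multiplies, and the particular weight profile are illustrative assumptions. Both ρ and k are supplied by the caller here; the closed-form oracle coefficient and the data-driven bandwidth selection described in the abstract are not reproduced.

```python
import numpy as np


def tapering_weights(p, k):
    """Trapezoidal tapering weights (assumed profile, for illustration).

    Weight is 1 for |i - j| <= k/2, decays linearly to 0 at |i - j| = k,
    and is 0 beyond that. Requires k > 0.
    """
    idx = np.arange(p)
    d = np.abs(np.subtract.outer(idx, idx))  # |i - j| for every entry
    return np.clip(2.0 - 2.0 * d / k, 0.0, 1.0)


def shrink_to_taper(X, rho, k):
    """Shrink the sample covariance of X (n samples x p variables)
    toward its tapered version.

    rho in [0, 1] and bandwidth k are user-supplied in this sketch;
    the paper derives rho in closed form and selects k from data.
    """
    S = np.cov(X, rowvar=False)                  # p x p sample covariance
    T = tapering_weights(S.shape[0], k) * S      # elementwise (Hadamard) taper
    return (1.0 - rho) * S + rho * T


# Hypothetical usage: fewer samples than variables.
rng = np.random.default_rng(0)
X = rng.standard_normal((5, 8))
Sigma_hat = shrink_to_taper(X, rho=0.5, k=4)
```

Under this convention, ρ = 0 returns the sample covariance unchanged and ρ = 1 returns the purely tapered estimate, so intermediate values interpolate between the two.
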
Original language | English (US) |
---|---|
Article number | 6252067 |
Pages (from-to) | 5640-5656 |
Number of pages | 17 |
Journal | IEEE Transactions on Signal Processing |
Volume | 60 |
Issue number | 11 |
State | Published - 2012 |
Externally published | Yes |
Keywords
- Large covariance estimation
- minimax risk
- minimum mean-squared errors
- shrinkage estimator
- tapering operator
ASJC Scopus subject areas
- Signal Processing
- Electrical and Electronic Engineering