Abstract
Under rotation-equivariant decision theory, sample covariance matrix eigenvalues can be optimally shrunk by recombining sample eigenvectors with a (potentially nonlinear) function of the unobservable population covariance matrix. The optimal shape of this function reflects the loss/risk that is to be minimized. We solve the problem of optimal covariance matrix estimation under a variety of loss functions motivated by statistical precedent, probability theory, and differential geometry. A key ingredient of our nonlinear shrinkage methodology is a new estimator of the angle between sample and population eigenvectors that does not require strong assumptions on the population eigenvalues. We also introduce a broad family of covariance matrix estimators that can handle all regular functional transformations of the population covariance matrix under large-dimensional asymptotics. In addition, we compare via Monte Carlo simulations our methodology to two simpler ones from the literature, linear shrinkage and shrinkage based on the spiked covariance model.
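The rotation-equivariant recipe in the abstract (keep the sample eigenvectors, replace the sample eigenvalues with shrunk values) can be sketched in a few lines of NumPy. This is a minimal illustration, not the paper's method: the shrinkage function used below, a simple linear pull toward the mean eigenvalue with a hypothetical intensity `rho`, is a placeholder where the paper's loss-specific nonlinear shrinkers would go.

```python
import numpy as np

def rotation_equivariant_estimator(S, shrink):
    """Keep the sample eigenvectors of S and replace its sample
    eigenvalues with the shrunk values returned by `shrink`."""
    eigvals, U = np.linalg.eigh(S)        # sample spectral decomposition
    d = shrink(eigvals)                   # (possibly nonlinear) shrinkage
    return U @ np.diag(d) @ U.T

# Toy usage with an illustrative linear shrinker toward the grand mean.
rng = np.random.default_rng(0)
X = rng.standard_normal((50, 10))         # n = 50 observations, p = 10 variables
S = X.T @ X / X.shape[0]                  # sample covariance matrix
rho = 0.5                                 # hypothetical shrinkage intensity
Sigma_hat = rotation_equivariant_estimator(
    S, lambda lam: (1 - rho) * lam + rho * lam.mean()
)
```

Because only the eigenvalues change, the estimator commutes with rotations of the variables; this particular linear shrinker also preserves the trace of S, since the mean eigenvalue is unchanged.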
Highlights
Ever since Stein (1956) proved that the usual estimator of the mean is inadmissible in dimensions greater than three, decision theory has taken the edge over likelihood maximization in multivariate statistics.
In order to achieve this degree of generality, we identify a formula from random matrix theory (RMT) that enables us to estimate the angle of any sample eigenvector with any population eigenvector, in the large-dimensional asymptotic limit.
We have to bear in mind that we operate under large-dimensional asymptotics, so all quantities need to be normalized by the ever-increasing matrix dimension p in appropriate fashion.
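The quantity behind the second highlight, the angle between a sample and a population eigenvector, is easy to examine in a toy Monte Carlo experiment where the population covariance is known. The sketch below (all parameter values are hypothetical choices for illustration) computes the matrix of squared cosines |u_i'v_j|^2 between every sample eigenvector u_i and every population eigenvector v_j; it is these squared cosines that the RMT formula estimates from data alone.

```python
import numpy as np

rng = np.random.default_rng(2)
p, n = 20, 60                             # illustrative dimension and sample size
pop_eigvals = np.linspace(1.0, 4.0, p)
Sigma = np.diag(pop_eigvals)              # population eigenvectors = identity columns

X = rng.multivariate_normal(np.zeros(p), Sigma, size=n)
S = X.T @ X / n                           # sample covariance matrix
_, U = np.linalg.eigh(S)                  # sample eigenvectors (columns of U)

V = np.eye(p)                             # population eigenvectors
cos2 = (U.T @ V) ** 2                     # squared cosines of all pairwise angles
```

Since both eigenvector bases are orthonormal, every row and every column of `cos2` sums to one; the interesting object is how the mass spreads across entries as p grows with n.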
Summary
Ever since Stein (1956) proved that the usual estimator of the mean is inadmissible in dimensions greater than three, decision theory has taken the edge over likelihood maximization in multivariate statistics. We place ourselves firmly within the paradigm pioneered by Stein (1975, 1986): (i) no assumption on the eigenvalues of the population covariance matrix apart from positive definiteness; (ii) equivariance with respect to rotation of the original orthonormal basis of variables; and (iii) full flexibility to modify the eigenvalues of the sample covariance matrix as deemed necessary. This is a tall order, and even Stein's finite-sample mathematical prowess achieved limited progress. The spiked covariance model of Johnstone (2001) has been used by Donoho et al. (2018) to derive shrinkage covariance matrix estimators for a menagerie of 26 different loss functions. They promote the spiked model because, as they state in their Section 10, "the simple shrinkage rules we propose here may be more likely to be applied correctly in practice, and to work as expected, even in relatively small sample sizes." An appendix collects various mathematical results to keep the presentation in the main paper compact.