Abstract
We consider the problem of estimating high-dimensional covariance matrices of $K$ populations or classes in the setting where the sample sizes are comparable to the data dimension. We propose estimating each class covariance matrix as a distinct linear combination of all class sample covariance matrices. This approach is shown to reduce the estimation error when the sample sizes are limited and the true class covariance matrices share a somewhat similar structure. We develop an effective method for estimating the coefficients in the linear combination that minimize the mean squared error under the general assumption that the samples are drawn from (unspecified) elliptically symmetric distributions possessing finite fourth-order moments. To this end, we utilize the spatial sign covariance matrix, which we show (under rather general conditions) to be an asymptotically unbiased estimator of the normalized covariance matrix as the dimension grows to infinity. We also show how the proposed method can be used in choosing the regularization parameters for multiple target matrices in a single class covariance matrix estimation problem. We assess the proposed method via numerical simulation studies, including an application in global minimum variance portfolio optimization using real stock data.
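As a reading aid, the estimator described above can be written in symbols. The notation here is an assumption introduced for illustration and is not taken from the paper: $S_j$ denotes the sample covariance matrix of class $j$, $\Sigma_k$ the true covariance matrix of class $k$, $a_{jk}$ the weight that class $j$ receives in the estimate for class $k$, and the mean squared error is assumed to be measured in the Frobenius norm.

$$
\hat{\Sigma}_k(a_k) = \sum_{j=1}^{K} a_{jk} S_j,
\qquad
a_k^{\star} = \arg\min_{a_k} \, \mathbb{E}\left[ \left\| \hat{\Sigma}_k(a_k) - \Sigma_k \right\|_F^2 \right],
\qquad k = 1, \dots, K.
$$

Because each class $k$ has its own coefficient vector $a_k = (a_{1k}, \dots, a_{Kk})^\top$, the combination is "distinct" per class: a class with few samples can borrow more from similar classes than a class with many samples.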
Highlights
High-dimensional covariance matrix estimation is a challenging problem as the dimension $p$ of the observations can be much larger than the sample size $n$.
We propose to estimate each class covariance matrix as a nonnegative linear combination of the sample covariance matrices (SCMs) of all classes.
We use the spatial sign covariance matrix (SSCM), which we show under rather general assumptions to be asymptotically unbiased with respect to growing dimension.
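To make the SSCM concrete, the following is a minimal Python sketch rather than the authors' implementation. It makes two assumptions that the text above does not state: the location `center` is supplied by the caller (defaulting to the origin), and the matrix is scaled by $p$ so that its trace equals $p$. The function name `spatial_sign_covariance` is hypothetical.

```python
import numpy as np

def spatial_sign_covariance(X, center=None):
    """Spatial sign covariance matrix of the rows of X (shape n x p).

    Assumptions (not from the source): the centre is given (origin by
    default) and the result is scaled by p so that its trace is p.
    """
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    if center is None:
        center = np.zeros(p)
    Xc = X - center
    norms = np.linalg.norm(Xc, axis=1)
    keep = norms > 0                    # drop observations equal to the centre
    U = Xc[keep] / norms[keep, None]    # project each observation onto the unit sphere
    return p * (U.T @ U) / U.shape[0]   # scaled average of outer products u u^T

rng = np.random.default_rng(0)
X = rng.standard_normal((50, 5))
Lambda_hat = spatial_sign_covariance(X)
```

Because each observation is normalized to unit length before averaging, the result depends only on the directions of the data, which is what makes the SSCM insensitive to the heavy tails that elliptical distributions can have.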
Summary
High-dimensional covariance matrix estimation is a challenging problem as the dimension $p$ of the observations can be much larger than the sample size $n$. At least some of the $K$ population covariance matrices can be similar (close to each other in terms of a suitable distance metric), and so it would be beneficial to use regularization to reduce the variance of the final estimates of the covariance matrices. Following this idea, we propose to estimate each class covariance matrix as a nonnegative linear combination of the sample covariance matrices (SCMs) of all classes. We propose covariance matrix estimators for multiclass problems, based on linearly pooling the class SCMs. Several aspects and properties of the estimator are discussed, including possible modifications and an extension for complex-valued data. We show how our proposed method can be used as a multi-target shrinkage covariance matrix estimator in a single class problem with arbitrary positive semidefinite target matrices.
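The pooling step itself can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the method of the paper: the coefficient matrix `A` is supplied by hand, whereas the paper estimates coefficients that minimize the mean squared error. The function name `pool_scms` and the convention that column `k` of `A` holds the weights for class `k` are hypothetical.

```python
import numpy as np

def pool_scms(scms, A):
    """Form each class estimate as a nonnegative combination of all class SCMs.

    scms : sequence of K arrays, each p x p (one SCM per class)
    A    : K x K array; A[j, k] is the weight of class j's SCM in the
           estimate for class k (hand-picked here, an assumption)
    Returns an array of shape (K, p, p).
    """
    S = np.stack([np.asarray(s, dtype=float) for s in scms])  # K x p x p
    A = np.asarray(A, dtype=float)
    if A.shape != (S.shape[0], S.shape[0]):
        raise ValueError("A must be K x K")
    if np.any(A < 0):
        raise ValueError("coefficients must be nonnegative")
    # estimate[k] = sum_j A[j, k] * S[j]
    return np.einsum("jk,jab->kab", A, S)

rng = np.random.default_rng(0)
samples = [rng.standard_normal((n, 4)) for n in (8, 30)]  # two classes, p = 4
scms = [np.cov(X, rowvar=False) for X in samples]
A = np.array([[0.6, 0.1],
              [0.4, 0.9]])  # the small class borrows more from the large one
estimates = pool_scms(scms, A)
```

Restricting the coefficients to be nonnegative means every estimate is a nonnegative combination of positive semidefinite matrices and is therefore itself positive semidefinite. Passing a list of target matrices in place of the other classes' SCMs gives the general shape of the single-class, multi-target use mentioned above.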