Abstract

Consider $n$ independent and identically distributed $p$-dimensional Gaussian random vectors with covariance matrix $\Sigma$. The problem of estimating $\Sigma$ when $p$ is much larger than $n$ has received a lot of attention in recent years. Yet little is known about information criteria for covariance matrix estimation. How should such a criterion be defined, and what are its statistical properties? We attempt to answer these questions in the present paper by focusing on the estimation of bandable covariance matrices when $p>n$ but $\log(p)=o(n)$. Motivated by the deep connection between Stein's unbiased risk estimation (SURE) and AIC in regression models, we propose a family of generalized SURE criteria ($\text{SURE}_c$), indexed by a constant $c$, for covariance matrix estimation. When $c=2$, $\text{SURE}_2$ provides an unbiased estimator of the Frobenius risk of the covariance matrix estimator. Furthermore, we show that minimizing $\text{SURE}_2$ over all possible banding covariance matrix estimators attains the minimax optimal rate of convergence, and that the resulting estimator behaves like the covariance matrix estimator obtained by the so-called oracle tuning. On the other hand, we also show that $\text{SURE}_2$ is selection inconsistent when the true covariance matrix is exactly banded. To fix the selection inconsistency, we consider $\text{SURE}_c$ with $c=\log(n)$ and prove that minimizing $\text{SURE}_{\log(n)}$ selects the true bandwidth with probability tending to one. Therefore, our analysis indicates that $\text{SURE}_2$ and $\text{SURE}_{\log(n)}$ can be regarded as the AIC and BIC for large covariance matrix estimation, respectively.
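
To make the idea of bandwidth selection by a $\text{SURE}_c$-type criterion concrete, the following is a minimal sketch rather than the paper's implementation. It assumes one plausible form of the criterion, $\|B_k(S)-S\|_F^2 + c\sum_{|i-j|\le k}\widehat{\mathrm{var}}(s_{ij})$, where $S$ is the sample covariance matrix and $B_k(S)$ keeps only entries with $|i-j|\le k$. The variance term uses a simple Gaussian plug-in, $(s_{ij}^2+s_{ii}s_{jj})/(n-1)$, which is an assumption made here for illustration and is not exactly unbiased. The function name `sure_c_banding` is hypothetical.

```python
import numpy as np


def sure_c_banding(X, c=2.0):
    """Pick a banding bandwidth by minimizing a SURE_c-style criterion.

    Assumed (illustrative) criterion for bandwidth k:
        ||B_k(S) - S||_F^2 + c * sum_{|i-j| <= k} vhat_ij
    with the Gaussian plug-in vhat_ij = (s_ij^2 + s_ii * s_jj) / (n - 1).
    X has shape (n, p): rows are observations, columns are variables.
    """
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    S = np.cov(X, rowvar=False)              # sample covariance, divisor n - 1
    d = np.diag(S)
    V = (S ** 2 + np.outer(d, d)) / (n - 1)  # plug-in variance of each s_ij
    idx = np.arange(p)
    lag = np.abs(np.subtract.outer(idx, idx))  # |i - j| for every entry

    crit = np.empty(p)
    for k in range(p):
        inside = lag <= k
        fit = np.sum(S[~inside] ** 2)        # squared entries that banding zeroes out
        penalty = c * np.sum(V[inside])      # variance cost of the entries kept
        crit[k] = fit + penalty

    k_hat = int(np.argmin(crit))
    Sigma_hat = np.where(lag <= k_hat, S, 0.0)  # banded estimator B_{k_hat}(S)
    return k_hat, Sigma_hat, crit
```

Under these assumptions, calling `sure_c_banding(X, c=2.0)` corresponds to the AIC-like choice described above, and `sure_c_banding(X, c=np.log(X.shape[0]))` to the BIC-like choice. The loop recomputes each sum from scratch for readability; cumulative sums over the diagonals would be cheaper for large $p$.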
