Abstract
We study the accuracy of estimating the covariance matrix and the precision matrix of a $D$-variate sub-Gaussian distribution along a prescribed subspace or direction from the finite-sample covariance. Our results show that the estimation accuracy depends almost exclusively on the components of the distribution that correspond to the desired subspaces or directions. This is relevant for problems where the behavior of data along a lower-dimensional space is of specific interest, such as dimension reduction or structured regression problems. We also show that the estimation of precision matrices is nearly independent of the condition number of the covariance matrix. The presented applications include direction-sensitive eigenspace perturbation bounds, relative bounds for the smallest eigenvalue, and the estimation of the single-index model. For the latter, we propose a new estimator, derived from the analysis, with strong theoretical guarantees and superior numerical performance.
Highlights
Estimating the covariance Σ = E(X − EX)(X − EX)ᵀ and the precision matrix Σ† of a random vector X ∈ RD is a standard and long-standing problem in multivariate statistics, with applications in a number of mathematical and applied fields.
Notable examples include any form of dimension reduction, such as principal component analysis, nonlinear dimension reduction, and manifold learning, as well as problems ranging from classification, regression, and signal processing to econometrics, brain imaging, and social networks.
Bounds developed in this work have a few immediate corollaries, which might be of independent interest. These include eigenspace perturbation bounds similar to [59, Theorem 1], but which are sensitive to the behavior of X in the direction corresponding to the eigenspace of interest, and a relative bound for the smallest eigenvalue of Σ comparable to [58, Theorem 2.2], but without the isotropy assumption.
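The two estimators discussed above can be sketched in a few lines. This is a minimal illustration (the variable names and the synthetic setup are not from the paper): the finite-sample covariance Σ̂ is formed from centered samples, and the precision matrix is estimated by the Moore–Penrose pseudo-inverse Σ̂†.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative setup: n samples of a D-variate distribution with
# covariance Σ = L Lᵀ (L is an arbitrary full-rank factor).
D, n = 5, 2000
L = rng.standard_normal((D, D))
Sigma = L @ L.T                        # true covariance Σ
X = rng.standard_normal((n, D)) @ L.T  # samples with covariance Σ

# Finite-sample covariance Σ̂ = (1/n) Σᵢ (xᵢ − x̄)(xᵢ − x̄)ᵀ
Xc = X - X.mean(axis=0)
Sigma_hat = Xc.T @ Xc / n

# Precision-matrix estimate via the Moore–Penrose pseudo-inverse Σ̂†
Prec_hat = np.linalg.pinv(Sigma_hat)
```

Since n ≫ D here, Σ̂ is full rank and the pseudo-inverse coincides with the ordinary inverse; the pseudo-inverse form also covers the rank-deficient case.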
Summary
Many modern data analysis tasks explicitly rely on anisotropic distributions because different spectral modalities of the covariance matrix provide crucial, and complementary, information about the task at hand. In this case, using norm submultiplicativity together with standard bounds for ‖Σ̂ − Σ‖ and ‖Σ̂† − Σ†‖ overestimates the incurred errors, because it decouples the prescribed matrices A and B from their effect on the covariance and precision matrices. A typical example that leverages different modalities of (conditional) covariance matrices is the analysis of the structure of point clouds, as in manifold learning. Such methods are often prefaced by a linearization step, in which the globally non-linear geometry is locally approximated by tangent spaces.
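A quick numerical sketch of why the decoupled bound is pessimistic (the setup below is illustrative, not taken from the paper): along a low-variance direction u, the directional estimation error |uᵀ(Σ̂ − Σ)u| is governed by the variance along u, and is far smaller than the decoupled bound ‖u‖² ‖Σ̂ − Σ‖, which is dominated by the high-variance directions.

```python
import numpy as np

rng = np.random.default_rng(1)
D, n = 20, 500

# Anisotropic covariance: one dominant direction, the rest small.
evals = np.array([100.0] + [1.0] * (D - 1))
Sigma = np.diag(evals)
X = rng.standard_normal((n, D)) * np.sqrt(evals)  # samples with covariance Σ

# Finite-sample covariance and its error E = Σ̂ − Σ
Xc = X - X.mean(axis=0)
Sigma_hat = Xc.T @ Xc / n
E = Sigma_hat - Sigma

u = np.zeros(D)
u[1] = 1.0                          # unit vector along a low-variance direction
directional = abs(u @ E @ u)        # error along u
global_norm = np.linalg.norm(E, 2)  # operator-norm error, ≈ the decoupled bound
```

Here `directional` is orders of magnitude below `global_norm`: the error along u scales with the small eigenvalue 1, while the operator norm scales with the large eigenvalue 100.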