Abstract
The well-known spatial sign covariance matrix (SSCM) carries out a radial transform which moves all data points to a sphere, followed by computing the classical covariance matrix of the transformed data. Its popularity stems from its robustness to outliers, fast computation, and applications to correlation and principal component analysis. In this paper we study more general radial functions. It is shown that the eigenvectors of the generalized SSCM are still consistent and the ranks of the eigenvalues are preserved. The influence function of the resulting scatter matrix is derived, and it is shown that its asymptotic breakdown value is as high as that of the original SSCM. A simulation study indicates that the best results are obtained when the inner half of the data points are not transformed and points lying far away are moved to the center.
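As a rough illustration of the kind of radial function the simulation finding points to, the sketch below leaves the inner half of the points unchanged, shrinks more distant points linearly, and moves points beyond a cutoff to the center. It is not the paper's definition: the function name generalized_sscm, the parameter cutoff_factor, the linear shrinkage between the median radius and the cutoff, and the coordinatewise median as the location estimate are all assumptions made for this example.

```python
import numpy as np

def generalized_sscm(X, cutoff_factor=3.0):
    """Covariance of radially transformed data (illustrative radial function only)."""
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    # Assumption: the coordinatewise median serves as the location estimate.
    Z = X - np.median(X, axis=0)
    r = np.linalg.norm(Z, axis=1)      # distance of each point to the center
    q = np.median(r)                   # the inner half of the points has r <= q
    c = cutoff_factor * q              # hypothetical cutoff radius, needs cutoff_factor > 1
    # Weight 1 for r <= q, linear decrease to 0 between q and c, 0 for r >= c.
    w = np.clip((c - r) / (c - q), 0.0, 1.0)
    G = Z * w[:, None]                 # radially transformed points
    return G.T @ G / n                 # classical covariance about the center
```

Because points beyond the cutoff receive weight zero, moving a gross outlier even farther away leaves this matrix unchanged, as long as the median-based center and median radius stay the same.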
Highlights
Robust estimation of the covariance matrix is an important and challenging problem
In order to improve its robustness against a substantial fraction of outliers we propose to use the k-step least trimmed squares (LTS) estimator
The Kullback-Leibler divergence (KLdiv) plots on the left indicate that the spatial sign covariance matrix (SSCM) performs poorly for constant and linear eigenvalues, and looks better for quadratic eigenvalues but not when γ is large
Summary
Robust estimation of the covariance (scatter) matrix is an important and challenging problem. The best-known orthogonally equivariant scatter estimator is the spatial sign covariance matrix (SSCM), proposed independently by Marden (1999) and Visuri et al. (2000) and studied in more detail by Magyar and Tyler (2014) and Dürre et al. (2014, 2016), among others. The estimator computes the regular covariance matrix on the spatial signs of the data, which are the projections of the location-centered data points onto the unit sphere. The SSCM is well-suited for this task as it is very fast and highly robust against outlying observations, and it often yields a reliable starting value. Another application of the SSCM is testing for sphericity (Sirkiä et al., 2009), which uses the asymptotic properties of the SSCM to assess whether the underlying distribution of the data deviates substantially from a spherical distribution.
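The computation described above (center the data, project each point onto the unit sphere, take the covariance of the projections) can be sketched in a few lines. This is a minimal sketch, not the authors' implementation: the function name spatial_sign_covariance, the coordinatewise median as the default location estimate, and the toy data with planted outliers are assumptions made for illustration.

```python
import numpy as np

def spatial_sign_covariance(X, center=None):
    """SSCM sketch: covariance of the centered points projected onto the unit sphere."""
    X = np.asarray(X, dtype=float)
    if center is None:
        # Assumption: coordinatewise median as a simple robust location estimate.
        center = np.median(X, axis=0)
    Z = X - center
    norms = np.linalg.norm(Z, axis=1)
    signs = np.zeros_like(Z)           # a point equal to the center gets sign 0
    nonzero = norms > 0
    signs[nonzero] = Z[nonzero] / norms[nonzero, None]
    return signs.T @ signs / X.shape[0]

# Toy data: first coordinate has three times the spread of the second,
# and 5% of the rows are replaced by gross outliers.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 2)) * np.array([3.0, 1.0])
X[:25] = 1e6
S = spatial_sign_covariance(X)
eigvals, eigvecs = np.linalg.eigh(S)   # eigenvalues in ascending order
```

Every projected point has unit length, so each observation contributes equally regardless of how far it lies from the center. This is why the planted outliers cannot dominate the estimate, and why the leading eigenvector in this toy example stays close to the first coordinate axis.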