Abstract

We would like to congratulate the authors for their innovative paper, containing many stimulating ideas. The authors propose an estimator for multivariate location and scatter, robust to both cellwise and casewise contamination. The idea is simple: (i) check for large univariate outliers and replace these cellwise outliers by NA, and (ii) apply the S-estimator for missing values of Danilov et al. (2012). If no cellwise outliers are detected in step (i), the proposed estimator equals the regular S-estimator and shares its affine equivariance property. The authors will agree that the main power of the estimator comes from the second step, where the casewise (or multivariate) outliers are detected using a robust version of the Mahalanobis distance. For every observation, this distance is computed in the dimension given by the number of non-missing components. Danilov et al. (2012) present a smart way to compute an S-estimator associated with Mahalanobis distances computed in different dimensions. Estimation of the scatter matrix is ‘a corner stone in many applications’, as the authors state. However, the applications that the authors list (principal component analysis, factor analysis, and multiple linear regression) require the precision matrix Θ = Σ^{-1} rather than the covariance matrix Σ. Obviously, the inverse of the proposed two-step generalized S-estimator (TSGS) yields an estimate of the precision matrix. In this discussion note we (i) investigate the performance of TSGS as a precision matrix estimator by means of a modest simulation study, (ii) discuss a regularized version of
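
The two ingredients described above can be illustrated with a minimal sketch. It makes several assumptions: a simple median/MAD cutoff stands in for the authors' univariate filter (their actual rule may differ), the function names `flag_cellwise_outliers` and `partial_mahalanobis` and the cutoff value 3.5 are hypothetical, and the location and scatter passed to the distance function are taken as given rather than computed by the S-estimator of Danilov et al. (2012), which is not implemented here.

```python
import numpy as np


def flag_cellwise_outliers(X, cutoff=3.5):
    """Simplified step (i): replace cells with a large robust z-score by NaN.

    Assumption: median/MAD standardization with a fixed cutoff is used here
    purely for illustration; it is not the filter of the original paper.
    """
    X = np.asarray(X, dtype=float)
    med = np.nanmedian(X, axis=0)
    mad = 1.4826 * np.nanmedian(np.abs(X - med), axis=0)
    # Columns with zero MAD are left untouched (their z-scores become NaN).
    mad = np.where(mad > 0, mad, np.nan)
    z = np.abs(X - med) / mad
    out = X.copy()
    out[z > cutoff] = np.nan
    return out


def partial_mahalanobis(x, mu, sigma):
    """Squared Mahalanobis distance over the non-missing components of x.

    Returns the distance and the number of observed components, i.e. the
    dimension in which the distance was computed.
    """
    x = np.asarray(x, dtype=float)
    mu = np.asarray(mu, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    obs = ~np.isnan(x)
    diff = x[obs] - mu[obs]
    sub = sigma[np.ix_(obs, obs)]  # scatter restricted to observed cells
    d2 = float(diff @ np.linalg.solve(sub, diff))
    return d2, int(obs.sum())


# Toy data: the last row has one grossly outlying cell in the first column.
X = np.array([[0.0, 1.0], [0.1, 1.1], [-0.1, 0.9],
              [0.2, 1.2], [-0.2, 0.8], [50.0, 1.0]])
X_filtered = flag_cellwise_outliers(X)
d2, p_obs = partial_mahalanobis(X_filtered[5], mu=[0.0, 1.0], sigma=np.eye(2))
```

In this toy example only the contaminated cell is set to NaN, so the last observation keeps its clean second component and its distance is computed in one dimension instead of two.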
