Abstract

We consider the problem of outliers in incomplete multivariate data when the aim is to estimate a measure of mean and covariance, as is the case, for example, in factor analysis. The ER algorithm of Little and Smith which combines the EM algorithm for missing data and a robust estimation step based on an M-estimator could be used in such a situation. However, the ER algorithm as originally proposed can fail to be robust in some cases, especially in high dimensions. We propose here two alternatives to avoid the problem. One is to combine a small modification of the ER algorithm with a so-called high-breakdown estimator as the starting point for the iterative procedure, and the other is to base the estimation step of the ER algorithm on a high-breakdown estimator. Among the high-breakdown estimators which are actually built to keep their robustness properties even if the number of variables is relatively large, we consider here the minimum covariance determinant estimator and the t-biweight S-estimator. Simulated and real data are used to compare and illustrate the different procedures.

Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.