Abstract
Agostinelli, Leung, Yohai and Zamar (Agostinelli et al. in the remainder) consider the difficult problem of robust estimation based on high-dimensional data. If outlying values can appear independently in the variables, then the majority of the observations in high-dimensional data can easily be contaminated, as pointed out in Alqallaf et al. (2009). Consequently, standard robust methods fail in this case, and new methods are needed that can handle this type of contamination. Moreover, in addition to independent contamination, casewise or structural outliers can still appear in the data. This situation was formalized as the partially spoiled independent contamination model (PSICM) in Alqallaf et al. (2009). In their paper, Agostinelli et al. are the first to introduce a consistent estimator of multivariate location and scatter that is highly robust against both cellwise and casewise outliers. Their estimator, the 2SGS, is a strongly consistent estimator of the location and shape of general elliptical distributions. Similarly to other proposals, the estimator proceeds in two steps. In the first step, an outlier detection rule is used to identify potential cellwise outliers. A first novelty is the use of a data-adaptive cutoff instead of a fixed cutoff value when filtering cellwise outliers. The second novelty is to replace flagged outliers by missing values, as first proposed in Danilov (2010) and Farcomeni (2013), whereas earlier proposals tried to reduce their effect through some form of Winsorization; see e.g. Alqallaf et al. (2002), Van Aelst et al. (2011, 2012), Van Aelst (2015). In the second step, the location and scatter are estimated from the data set with missing values using the GSE estimator of Danilov et al. (2012). GSE
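Two points from the summary above can be illustrated with a minimal Python sketch. It is not the 2SGS procedure itself. The function names `contaminated_row_fraction` and `flag_cells_as_missing` are hypothetical. The filter uses a fixed cutoff on median/MAD z-scores purely as a simplified stand-in; the discussed paper uses a data-adaptive cutoff instead. The second step (GSE estimation on the data with missing values) is not implemented here.

```python
import numpy as np


def contaminated_row_fraction(eps: float, p: int) -> float:
    """Probability that a row of p cells contains at least one outlying cell
    when each cell is contaminated independently with probability eps."""
    return 1.0 - (1.0 - eps) ** p


def flag_cells_as_missing(X, cutoff: float = 3.0) -> np.ndarray:
    """Replace cells with a large robust z-score by NaN.

    Assumption: a fixed cutoff on column-wise median/MAD z-scores, used only
    as a simple illustration of "filter, then treat as missing".
    """
    X = np.asarray(X, dtype=float)
    med = np.median(X, axis=0)
    # 1.4826 makes the MAD consistent for the standard deviation at the normal.
    mad = 1.4826 * np.median(np.abs(X - med), axis=0)
    # A zero MAD would give a division by zero; use an infinite scale so that
    # no cell in such a column is flagged.
    mad = np.where(mad > 0, mad, np.inf)
    z = np.abs(X - med) / mad
    out = X.copy()
    out[z > cutoff] = np.nan  # flagged cells become missing values
    return out


if __name__ == "__main__":
    # With 5% independent cell contamination and 20 variables, more than
    # half of the rows are expected to contain at least one outlying cell.
    print(contaminated_row_fraction(0.05, 20))

    X = np.column_stack([np.linspace(-1, 1, 21), np.linspace(-1, 1, 21)])
    X[0, 1] = 50.0  # a single cellwise outlier
    print(np.argwhere(np.isnan(flag_cells_as_missing(X))))
```

Replacing a flagged cell by a missing value keeps the remaining cells of that row available for the second step, whereas discarding or downweighting the whole row would lose them.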