Abstract
In a number of recent articles, Riani, Cerioli, Atkinson and others advocate the technique of monitoring robust estimates computed over a range of key parameter values. Through this approach the diagnostic tools of choice can be tuned so that highly robust estimators which are as efficient as possible are obtained. The approach is applicable to various robust multivariate estimates such as S- and MM-estimates, MVE and MCD, as well as to the Forward Search, in which monitoring is part of the robust method. The key tools for detecting multivariate outliers and for monitoring robust estimates are the Mahalanobis distances and statistics related to these distances. However, the results obtained with these tools in the case of compositional data might be unrealistic, since compositional data contain relative rather than absolute information and need to be transformed to the usual Euclidean geometry before the standard statistical tools can be applied. Various transformations of compositional data have been introduced in the literature, and theoretical results exist on the equivalence of the additive, the centered, and the isometric logratio transformations in the context of outlier identification. To illustrate the problem of monitoring compositional data and to demonstrate the usefulness of monitoring in this case, we start with a simple example and then analyze a real-life data set on the technological structure of manufactured exports. The analysis is conducted with the R package fsdaR, which makes the analytical and graphical tools provided in the MATLAB FSDA library available to R users.
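The equivalence of logratio transformations mentioned above can be illustrated with a minimal sketch. It is written in Python with NumPy rather than with fsdaR, it uses the classical sample mean and covariance instead of a robust estimator, and the function names (`closure`, `alr`, `ilr`, `squared_mahalanobis`) and the simulated data are assumptions made for illustration only. Under these assumptions, the squared Mahalanobis distances computed from additive and isometric logratio coordinates coincide, because the two coordinate systems differ by a nonsingular linear map.

```python
import numpy as np


def closure(x):
    """Rescale each row so that its parts sum to 1."""
    x = np.asarray(x, dtype=float)
    return x / x.sum(axis=1, keepdims=True)


def alr(x):
    """Additive logratio: log of each part relative to the last part."""
    logx = np.log(x)
    return logx[:, :-1] - logx[:, [-1]]


def ilr(x):
    """Isometric logratio with respect to one particular orthonormal basis."""
    n_parts = x.shape[1]
    logx = np.log(x)
    clr = logx - logx.mean(axis=1, keepdims=True)  # centered logratio
    # Build D-1 orthonormal contrasts, each orthogonal to the vector of ones.
    basis = np.zeros((n_parts, n_parts - 1))
    for j in range(1, n_parts):
        basis[:j, j - 1] = 1.0 / j
        basis[j, j - 1] = -1.0
        basis[:, j - 1] *= np.sqrt(j / (j + 1.0))
    return clr @ basis


def squared_mahalanobis(z):
    """Classical (non-robust) squared Mahalanobis distances of the rows of z."""
    diff = z - z.mean(axis=0)
    cov_inv = np.linalg.inv(np.cov(z, rowvar=False))
    return np.einsum("ij,jk,ik->i", diff, cov_inv, diff)


# Simulated 4-part compositions (hypothetical data, for illustration only).
rng = np.random.default_rng(0)
comp = closure(rng.lognormal(size=(50, 4)))

d_alr = squared_mahalanobis(alr(comp))
d_ilr = squared_mahalanobis(ilr(comp))
same_distances = np.allclose(d_alr, d_ilr)
```

For affine equivariant robust estimators the same argument applies, but that case is not shown here; the sketch only covers the classical estimates.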
Highlights
In many cases the data sets are characterized by multivariate observations containing relative contributions of parts to a whole
All the methods discussed in a number of papers on forward search and monitoring are implemented in the Flexible Statistics and Data Analysis (FSDA) toolbox (Riani, Perrotta, and Torti, 2012), freely available for users with a MATLAB license at hand from http://rosa.unipr.it
Whether robust methods are used directly to fit models or indirectly to identify outliers, some arbitrarily chosen parameters can have a destructive effect on the results
Summary
In many cases the data sets are characterized by multivariate observations (vectors) containing relative contributions of parts to a whole. To our knowledge, none of the studies on export structure consider them as compositional data; the focus is on the structure of the exports, and the absolute values of the exports are not relevant for the analysis. Continuing further with this analogy, in the case of MM-estimates we obtain monitoring plots of squared MDs against efficiency, which is varied from 0.5 to 0.99 by a suitable step.