Abstract

Deviating multivariate observations are typically used to test the performance of outlier detection methods. Yet the generation of the outlying cases themselves is usually treated as a secondary methodological step in method comparisons. In the literature, outliers are defined through certain distribution parameters that differ from those of the clean or reference data. However, these parameters vary among authors, so there is no standard, measurable definition of the characteristics of simulated outliers. This makes comparisons between methods difficult and makes their results dependent on the procedure used to simulate the data. To establish a standard procedure, a framework for simulating outliers is defined here. Because it is based on specifications for both the Squared Prediction Error (SPE) and Hotelling’s T² statistics from a Principal Component Analysis (PCA) model, tuning these two statistics becomes a simple and efficient task. The procedure has been implemented in a set of Matlab functions.
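
To make the two statistics concrete, the following is a minimal sketch in Python with NumPy of how SPE and Hotelling’s T² can be computed from a PCA model, and how an observation can be displaced so that only one of them changes. It is not the authors' Matlab implementation: the function names `fit_pca` and `spe_t2`, the random reference data, the choice of two components, and the displacement size `k` are all assumptions made for illustration.

```python
import numpy as np

def fit_pca(X, n_components):
    """Fit a PCA model on reference data X (rows are observations)."""
    mean = X.mean(axis=0)
    # SVD of the centered data; rows of vt are the principal directions.
    _, s, vt = np.linalg.svd(X - mean, full_matrices=False)
    eigvals = s**2 / (X.shape[0] - 1)        # variance along each direction
    loadings = vt[:n_components].T           # directions kept in the model
    residual_dirs = vt[n_components:].T      # directions left out of the model
    return mean, loadings, eigvals[:n_components], residual_dirs

def spe_t2(X_new, mean, loadings, eigvals):
    """Return (SPE, T2) for each row of X_new under the fitted model."""
    Z = np.atleast_2d(X_new) - mean
    scores = Z @ loadings                    # projection onto the model plane
    residuals = Z - scores @ loadings.T      # part the model does not explain
    spe = np.sum(residuals**2, axis=1)       # squared prediction error
    t2 = np.sum(scores**2 / eigvals, axis=1) # Hotelling's T2 on the scores
    return spe, t2

# Hypothetical reference data: 200 observations of 5 correlated variables.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5)) @ rng.normal(size=(5, 5))
mean, P, lam, R = fit_pca(X, n_components=2)

k = 3.0                                      # assumed displacement size
x_t2 = mean + k * P[:, 0]                    # move inside the model plane
x_spe = mean + k * R[:, 0]                   # move orthogonally to the plane

spe_a, t2_a = spe_t2(x_t2, mean, P, lam)     # SPE stays ~0, T2 = k**2 / lam[0]
spe_b, t2_b = spe_t2(x_spe, mean, P, lam)    # SPE = k**2, T2 stays ~0
```

Under these assumptions, a displacement along a loading direction raises only T², while a displacement along a discarded direction raises only SPE. That independence is what makes it plausible to specify the two statistics separately when simulating outliers.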
