Detection of multivariate outliers in business survey data with incomplete information

Valentin Todorov, Peter Filzmoser, Matthias Templ

doi:10.1007/s11634-010-0075-2

Abstract

Many methods for statistical data editing can be found in the literature, but only a few of them are based on robust estimates (for example BACON-EEM, epidemic algorithms (EA) and transformed rank correlation (TRC) methods of Beguin and Hulliger). However, we can show that outlier detection is only reasonable if robust methods are applied, because the classical estimates are themselves influenced by the outliers. Nevertheless, data editing is essential to check the multivariate data for possible data problems, and it is not deterministic like traditional micro editing, where all records are extensively edited manually using certain rules/constraints. The presence of missing values is more the rule than the exception in business surveys and poses additional severe challenges to outlier detection. First we review the available multivariate outlier detection methods that can cope with incomplete data. In a simulation study, where a subset of the Austrian Structural Business Statistics is simulated, we compare several approaches. Robust methods based on the Minimum Covariance Determinant (MCD) estimator, S-estimators and the OGK estimator, as well as BACON-BEM, provide the best results in finding the outliers and in providing a low false discovery rate. Many of the discussed methods are implemented in the R package rrcovNA, which is available from the Comprehensive R Archive Network (CRAN) at http://www.CRAN.R-project.org under the GNU General Public License.
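
The statement that classical estimates are themselves influenced by the outliers (the masking effect) can be illustrated with a small sketch. This is not the rrcovNA implementation and it does not handle missing values. It assumes complete bivariate data, a hypothetical toy data set in which five records share one erroneous value pair, and a crude concentration-step iteration in the spirit of the MCD, rescaled by the median of the squared distances. All function and variable names are illustrative.

```python
import math
from statistics import median

# For 2 degrees of freedom the chi-square quantile is -2 * ln(1 - p).
CUTOFF = -2 * math.log(0.025)      # 0.975 quantile, about 7.38
CHI2_MEDIAN = -2 * math.log(0.5)   # 0.5 quantile, about 1.39


def mean_cov(points):
    """Sample mean and 2x2 sample covariance (n - 1 denominator)."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    return (mx, my), (sxx, sxy, syy)


def sq_mahalanobis(p, center, cov):
    """Squared Mahalanobis distance using the explicit 2x2 inverse."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy * sxy
    dx, dy = p[0] - center[0], p[1] - center[1]
    return (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det


def concentrated_subset(points, h, max_steps=50):
    """Simplified C-steps: repeatedly keep the h points with smallest distance."""
    med = (median([p[0] for p in points]), median([p[1] for p in points]))
    idx = range(len(points))
    # Start from the h points closest to the coordinate-wise median.
    subset = sorted(sorted(idx, key=lambda i: math.dist(points[i], med))[:h])
    for _ in range(max_steps):
        center, cov = mean_cov([points[i] for i in subset])
        order = sorted(idx, key=lambda i: sq_mahalanobis(points[i], center, cov))
        new_subset = sorted(order[:h])
        if new_subset == subset:
            break
        subset = new_subset
    return subset


def robust_sq_distances(points, h):
    """Distances from the concentrated subset, rescaled via the median."""
    subset = concentrated_subset(points, h)
    center, cov = mean_cov([points[i] for i in subset])
    raw = [sq_mahalanobis(p, center, cov) for p in points]
    scale = CHI2_MEDIAN / median(raw)
    return [d * scale for d in raw]


# Hypothetical data: 20 correlated clean records plus 5 identical bad records.
noise = (0.6, -0.5, 0.3, -0.7, 0.4)
clean = [((i - 9.5) / 2, 0.8 * (i - 9.5) / 2 + noise[i % 5]) for i in range(20)]
outliers = [(4.0, -60.0)] * 5
data = clean + outliers
outlier_idx = set(range(20, 25))

center, cov = mean_cov(data)
classical = [sq_mahalanobis(p, center, cov) for p in data]
robust = robust_sq_distances(data, h=(len(data) + 2 + 1) // 2)

classical_flags = {i for i, d in enumerate(classical) if d > CUTOFF}
robust_flags = {i for i, d in enumerate(robust) if d > CUTOFF}
print("flagged by classical distances:", sorted(classical_flags))
print("flagged by robust distances:   ", sorted(robust_flags))
```

In this construction the five planted records inflate the classical covariance so much that their own classical distances stay below the cutoff: for k identical records among n, the squared distance cannot exceed (n-1)(n-k)/(nk), which is 3.84 here. The distances based on the concentrated subset place the same records far above the cutoff. The sketch omits the consistency and reweighting steps of a full MCD, so it may also flag a few clean records.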

Journal: Advances in Data Analysis and Classification
Publication Date: Oct 27, 2010
Citations: 73

Similar Papers

Threshold Effects on Outlier Detection: A Comparative Study of MCD and MRCD Estimators in Multivariate Data Analysis
Nafisat Yusuf ... Bannister Jerry Zachary
Asian Journal of Probability and Statistics | VOL. 25 | 04 Nov 2023

Mahalanobis distance based on minimum regularized covariance determinant estimators for high dimensional data
Hasan Bulut
Communications in Statistics - Theory and Methods | VOL. 49 | 29 Jan 2020

Application of multivariate outlier detection to fluid velocity measurements
John Griffin ... Lawrence S Ukeiley
Experiments in Fluids | VOL. 49 | 14 Apr 2010

Robust Detection of Multivariate Outliers in Asset Returns and Risk Factors Data
... R Douglas Martin
SSRN Electronic Journal | 02 Oct 2017
