False Discovery Versus Familywise Error Rate Approaches to Outlier Detection

Yihuan Xu, Boris Iglewicz

doi:10.1080/19466315.2015.1119720

Abstract

Outliers, in general, are observations that deviate sufficiently from a base distribution. This study deals with outlier detection approaches for large samples from continuous univariate distributions. It investigates the properties of a practical, newer outlier detection approach that applies a false discovery rate (FDR) method in conjunction with a robustly estimated Tukey g-and-h base distribution. This FDR approach is compared with a boxplot-type outlier detection approach that controls the familywise error rate (FWER). The two options are compared in terms of error rates and of the effect of moving the outliers gradually further from the center of the base distribution, using 5% and 1% FDR or FWER levels. Two microarray datasets are used as examples where the assumed null distributions do not fit the data well. In such cases, the proposed estimated Tukey g-and-h null distribution approach leads to superior outlier detection performance. Supplementary materials for this article are available online.
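
To illustrate the distinction between FWER and FDR control in outlier flagging, the following minimal sketch may help. It is not the authors' procedure: it assumes a standard normal null in place of the robustly estimated Tukey g-and-h distribution, uses a Bonferroni correction as a stand-in for a FWER-controlling rule (not the boxplot-type rule studied in the article), and uses the Benjamini-Hochberg step-up procedure as one common FDR method, since the abstract does not state which FDR method is used. All function names and the toy data are hypothetical.

```python
import math


def two_sided_p(z):
    """Two-sided p-value of z under an ASSUMED standard normal null."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def bonferroni_outliers(pvals, alpha=0.05):
    """Indices flagged at FWER level alpha (Bonferroni stand-in)."""
    n = len(pvals)
    return [i for i, p in enumerate(pvals) if p <= alpha / n]


def bh_outliers(pvals, alpha=0.05):
    """Indices flagged at FDR level alpha (Benjamini-Hochberg step-up)."""
    n = len(pvals)
    order = sorted(range(n), key=lambda i: pvals[i])
    k = 0  # largest rank whose p-value is under its step-up threshold
    for rank, i in enumerate(order, start=1):
        if pvals[i] <= alpha * rank / n:
            k = rank
    return sorted(order[:k])


# Toy data (hypothetical): 95 evenly spaced base values in [-2, 2]
# followed by 5 planted values further from the center.
base = [-2.0 + 4.0 * i / 94 for i in range(95)]
planted = [2.9, 3.2, 3.4, 3.6, 4.5]
values = base + planted

pvals = [two_sided_p(z) for z in values]
fwer_flags = bonferroni_outliers(pvals, alpha=0.05)
fdr_flags = bh_outliers(pvals, alpha=0.05)

print("FWER (Bonferroni) flags:", [values[i] for i in fwer_flags])
print("FDR (Benjamini-Hochberg) flags:", [values[i] for i in fdr_flags])
```

On this toy sample the FDR rule flags every point the FWER rule flags plus some additional moderately extreme points, which reflects the general trade-off: FWER control limits the chance of any false flag, while FDR control tolerates a bounded proportion of false flags in exchange for more detections.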
