Abstract

A statistical hypothesis test is one of the most eminent methods in statistics. Its pivotal role comes from the wide range of practical problems it can be applied to and its modest data requirements. Being an unsupervised method makes it very flexible in adapting to real-world situations. The availability of high-dimensional data makes it necessary to apply such statistical hypothesis tests simultaneously to the test statistics of the underlying covariates. However, applied without correction, this leads to an inevitable inflation of Type 1 errors. To counteract this effect, multiple testing procedures have been introduced to control various types of errors, most notably the Type 1 error. In this paper, we review modern multiple testing procedures for controlling either the family-wise error rate (FWER) or the false discovery rate (FDR). We emphasize the principal approach of each, which allows them to be categorized as (1) single-step vs. stepwise approaches, (2) adaptive vs. non-adaptive approaches, and (3) marginal vs. joint multiple testing procedures. We place a particular focus on procedures that can deal with data exhibiting a (strong) correlation structure, because real-world data are rarely uncorrelated. Furthermore, we provide background information to make these often technically intricate methods accessible to interdisciplinary data scientists.

Highlights

  • We are living in a data-rich era in which every field of science or industry generates data seemingly in an effortless manner [1,2]

  • One method that is of central importance in this field is statistical hypothesis testing [9]

  • Statistical hypothesis testing is an unsupervised learning method that compares a null hypothesis with an alternative hypothesis to make a quantitative decision between the two
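As a minimal illustration of the last point (not taken from the paper), a single two-sample t-test compares a null hypothesis of equal means against the alternative of differing means; the sketch below uses simulated data and SciPy:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Simulated data: a control group and a treatment group with a shifted mean
control = rng.normal(loc=0.0, scale=1.0, size=50)
treatment = rng.normal(loc=0.5, scale=1.0, size=50)

# H0: the two population means are equal; H1: they differ
t_stat, p_value = stats.ttest_ind(control, treatment)

# Quantitative decision: reject H0 if the p-value falls below alpha
alpha = 0.05
reject = p_value < alpha
print(f"t = {t_stat:.3f}, p = {p_value:.4f}, reject H0: {reject}")
```

When many such tests are run simultaneously, each one carries its own chance of a false rejection, which is precisely the multiplicity problem the reviewed procedures address.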



Introduction

We are living in a data-rich era in which every field of science or industry generates data seemingly in an effortless manner [1,2]. In this setting, statistical hypothesis tests are routinely applied to many covariates at once, which calls for methods that correct for the resulting multiplicity. We discuss such methods, called multiple testing procedures (MTPs) (also known as multiple testing corrections (MTCs) or multiple comparisons (MCs)) [10,11,12], for controlling either the family-wise error rate (FWER) or the false discovery rate (FDR). We emphasize the principal approach of each, which allows them to be categorized as (1) single-step vs. stepwise approaches, (2) adaptive vs. non-adaptive approaches, and (3) marginal vs. joint MTPs. For the assessment of multiply tested hypotheses, there are many error measures that can be used, focusing on either Type 1 errors or Type 2 errors.
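To make the contrast between FWER and FDR control concrete, the following sketch (the paper itself works in R; this is an illustrative Python reimplementation on made-up p-values) applies the single-step Bonferroni correction and the step-up Benjamini-Hochberg procedure to the same set of tests:

```python
import numpy as np

def bonferroni(pvals, alpha=0.05):
    """Single-step FWER control: reject H0_i if p_i <= alpha / m."""
    p = np.asarray(pvals)
    return p <= alpha / len(p)

def benjamini_hochberg(pvals, alpha=0.05):
    """Step-up FDR control: reject the k smallest p-values, where k is
    the largest index with p_(k) <= (k / m) * alpha."""
    p = np.asarray(pvals)
    m = len(p)
    order = np.argsort(p)
    thresholds = alpha * np.arange(1, m + 1) / m
    below = p[order] <= thresholds
    reject = np.zeros(m, dtype=bool)
    if below.any():
        k = np.max(np.nonzero(below)[0])  # largest index meeting the criterion
        reject[order[: k + 1]] = True
    return reject

# Hypothetical p-values from m = 10 simultaneous tests
pvals = [0.001, 0.004, 0.012, 0.2, 0.35, 0.5, 0.6, 0.7, 0.8, 0.9]
print(bonferroni(pvals).sum())          # 2 rejections (FWER control)
print(benjamini_hochberg(pvals).sum())  # 3 rejections (FDR control)
```

The example shows the typical trade-off discussed in the paper: Bonferroni controls the stricter FWER criterion and rejects fewer hypotheses, while Benjamini-Hochberg controls the FDR and retains more power.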

Preliminaries
Formal Setting
Simulations in R
Focus on Pairwise Correlations
Focus on a Network Correlation Structure
Application of Multiple Testing Procedures
Motivation of the Problem
Theoretical Considerations
Experimental Example
Types of Multiple Testing Procedures
Controlling the FWER
Šidák Correction
Bonferroni Correction
Holm Correction
Hochberg Correction
Hommel Correction
Examples
Westfall-Young Procedure
Controlling the FDR
Benjamini-Hochberg Procedure
Example
Adaptive Benjamini-Hochberg Procedure
Benjamini-Yekutieli Procedure
Benjamini-Krieger-Yekutieli Procedure
BR-1S Procedure
BR-2S Procedure
Computational Complexity
Method
Summary
Marginal or joint distribution
Conclusions