In multiple simultaneous hypothesis testing (MSHT), a significance thresholding function, used as a scalar statistic, can be designed adaptively by sharing information among the many tests performed simultaneously. By using such an adapted statistic, MSHT achieves greater detection power than tests based on simple individual statistics. To systematically obtain an optimal thresholding function that maximizes detection power in MSHT, Storey (2007) proposed a theoretical framework called the optimal discovery procedure (ODP). He also proposed an empirical estimator of the ODP thresholding function for parametric MSHT, which presupposes parametric forms of the null and alternative likelihood functions. Empirical Bayesian testing (Efron et al. 2001), which is based on a non-parametric treatment of arbitrary test statistics, has sometimes exhibited power comparable to the ODP. These two MSHT frameworks appear to be closely related, but because of the difference in their approaches (frequentist vs. Bayesian), the relationship has not been well understood.

We present the new concept of an optimal sufficient statistic that links the ODP and empirical Bayesian frameworks, and we show that the local false discovery rate from the empirical Bayes approach can serve as an optimal thresholding function if a certain condition holds. We lay out exhaustive sets of assumptions under which optimal thresholding functions are achieved, and show that a thresholding function derived as optimal for a parametric MSHT problem remains optimal for a more general and broader range of MSHT problems defined in a non- or semi-parametric way. Our study thus provides a guide to designing optimal thresholding functions for general MSHT problems.
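To make the two thresholding statistics concrete, the following is a minimal toy sketch, assuming a normal-means two-group setup with illustrative parameter values (the proportion of nulls, the alternative effect sizes, and the helper names are assumptions for illustration, not quantities from the paper). It contrasts Storey's ODP statistic, a ratio of summed alternative likelihoods to summed true-null likelihoods, with the empirical-Bayes local false discovery rate, pi0 * f0(x) / f(x), of Efron et al. (2001).

```python
# Toy comparison of two MSHT thresholding statistics under a
# normal-means model. All parameter values are illustrative.
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of N(mu, sigma^2) at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def odp_statistic(x, null_mus, alt_mus):
    """ODP-style statistic (Storey 2007): summed alternative
    likelihoods over summed true-null likelihoods at x."""
    num = sum(normal_pdf(x, mu) for mu in alt_mus)
    den = sum(normal_pdf(x, mu) for mu in null_mus)
    return num / den

def local_fdr(x, pi0, alt_mus):
    """Empirical-Bayes local fdr (Efron et al. 2001):
    pi0 * f0(x) / f(x), with f the two-group mixture density."""
    f0 = normal_pdf(x, 0.0)                                   # null density
    f1 = sum(normal_pdf(x, mu) for mu in alt_mus) / len(alt_mus)  # alt density
    f = pi0 * f0 + (1.0 - pi0) * f1                           # mixture
    return pi0 * f0 / f

# Seven true nulls at mu = 0 and three alternatives (assumed values).
null_mus = [0.0] * 7
alt_mus = [2.0, -2.0, 3.0]
for x in (0.5, 2.5):
    print(x, odp_statistic(x, null_mus, alt_mus), local_fdr(x, 0.7, alt_mus))
```

In this symmetric toy setup the two statistics order observations in opposite directions: points with a large ODP statistic have a small local fdr, which is the kind of monotone correspondence the optimal-sufficient-statistic argument formalizes.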