Abstract

In this paper, we consider the problem of multiple testing where the hypotheses are dependent. In most of the existing literature, whether Bayesian or non-Bayesian, the decision rules focus mainly on the validity of the test procedure rather than on actually utilizing the dependence to increase efficiency. Moreover, the decisions regarding different hypotheses are marginal in the sense that they do not depend upon each other directly. However, in realistic situations, the hypotheses are usually dependent, and hence it is desirable that the decisions regarding the dependent hypotheses are taken jointly. In this article, we develop a novel Bayesian multiple testing procedure that coherently takes this requirement into consideration. Our method, which is based on new notions of error and non-error terms, substantially enhances efficiency by judicious exploitation of the dependence structure among the hypotheses. We show that our method minimizes the posterior expected loss associated with an additive “0-1” loss function; we also prove theoretical results on the relevant error probabilities, establishing the coherence and usefulness of our method. The optimal decision configuration is not available in closed form, so we propose an efficient simulated annealing algorithm for the purpose of optimization, which is also generically applicable to binary optimization problems. Extensive simulation studies indicate that in dependent situations, our method performs significantly better than some existing popular conventional multiple testing methods, in terms of accuracy and power control. Moreover, application of our ideas to a real, spatial data set associated with radionuclide concentration in Rongelap islands yielded insightful results.

Highlights

  • In modern-day practical statistical problems with many parameters, we are seldom interested in testing only one hypothesis

  • As in the case of single hypothesis testing with its well-known notions of Type-I and Type-II errors, the multiple testing literature offers several measures of error, for example, the family-wise error rate (FWER), which is the probability of rejecting at least one true null, the false discovery rate (FDR), which is the expected proportion of false discoveries, and the false non-discovery rate (FNR), which is the expected proportion of false non-discoveries

  • In a pathological example with 3 hypotheses we demonstrate that controlling modified positive Bayesian FDR (mpBFDR) yields larger PTD
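The error measures named in the highlights above can be illustrated with a small Python sketch. It computes, for a single set of decisions, the realized quantities whose expectations define the FWER, FDR and FNR. The function name `error_proportions`, the convention that an empty rejection (or acceptance) set gives a proportion of 0, and the toy truth and decision vectors are assumptions made for illustration; they are not taken from the paper, and the paper's Bayesian variants (such as mpBFDR) are not reproduced here.

```python
from typing import Dict, Sequence, Union


def error_proportions(
    null_is_true: Sequence[bool], rejected: Sequence[bool]
) -> Dict[str, Union[bool, float]]:
    """Realized error quantities for one vector of decisions.

    Averaging 'any_false_rejection', 'fdp' and 'fnp' over repeated data sets
    estimates FWER, FDR and FNR respectively.
    """
    if len(null_is_true) != len(rejected):
        raise ValueError("truth and decision vectors must have equal length")

    # A false discovery is a rejected hypothesis whose null is actually true.
    false_disc = sum(1 for t, r in zip(null_is_true, rejected) if t and r)
    # A false non-discovery is an accepted hypothesis whose null is actually false.
    false_nondisc = sum(1 for t, r in zip(null_is_true, rejected) if not t and not r)

    n_rejected = sum(1 for r in rejected if r)
    n_accepted = len(rejected) - n_rejected

    return {
        "any_false_rejection": false_disc > 0,
        # Assumed convention: a proportion over an empty set is taken as 0.
        "fdp": false_disc / n_rejected if n_rejected else 0.0,
        "fnp": false_nondisc / n_accepted if n_accepted else 0.0,
    }


# Hypothetical toy example: five hypotheses, two rejections.
truth = [True, True, False, False, True]
decisions = [True, False, True, False, False]
result = error_proportions(truth, decisions)
```

In this toy case one of the two rejections is a false discovery and one of the three acceptances is a false non-discovery, so the realized proportions are 1/2 and 1/3.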


Summary

Introduction

In modern-day practical statistical problems with many parameters, we are seldom interested in testing only one hypothesis. When the decisions are not directly (deterministically) dependent, the information provided by the joint structure inherent in the hypotheses is somewhat neglected by marginal multiple testing approaches, even though the data (and the prior in the Bayesian case) are dependently modelled. Using dependent decision rules should help to rectify the resulting errors, provided the information supplied by the dependence is utilized judiciously. In this regard, we develop a novel multiple testing procedure that coherently takes the dependence structure into consideration. In this context, we also propose and develop a novel simulated annealing algorithm for optimizing the criterion of our non-marginal method; this algorithm is applicable to any optimization problem consisting of binary variates. The “S”-labelled equations and the proofs of all our results are provided in the supplementary material.
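To make the idea of simulated annealing over binary variates concrete, the following is a generic textbook-style Python sketch, not the authors' algorithm. It maximizes an arbitrary objective over 0/1 vectors using single-bit flips, Metropolis acceptance and geometric cooling. The function name `anneal_binary`, the cooling schedule, all default parameter values and the toy objective are assumptions chosen for illustration only.

```python
import math
import random
from typing import Callable, List, Sequence, Tuple


def anneal_binary(
    objective: Callable[[Sequence[int]], float],
    n_bits: int,
    n_iter: int = 5000,
    t_start: float = 1.0,
    cooling: float = 0.999,
    seed: int = 0,
) -> Tuple[List[int], float]:
    """Maximize `objective` over binary vectors of length `n_bits`."""
    rng = random.Random(seed)
    current = [rng.randint(0, 1) for _ in range(n_bits)]
    current_val = objective(current)
    best, best_val = current[:], current_val
    temp = t_start

    for _ in range(n_iter):
        # Propose a neighbour by flipping one randomly chosen bit.
        i = rng.randrange(n_bits)
        current[i] ^= 1
        new_val = objective(current)
        delta = new_val - current_val

        # Always accept improvements; accept worse moves with
        # probability exp(delta / temp), which shrinks as temp falls.
        if delta >= 0 or rng.random() < math.exp(delta / temp):
            current_val = new_val
            if current_val > best_val:
                best, best_val = current[:], current_val
        else:
            current[i] ^= 1  # rejected: undo the flip

        temp *= cooling  # geometric cooling (an assumed schedule)

    return best, best_val


# Hypothetical toy objective: number of positions agreeing with a target pattern.
target = [1, 0, 1, 1, 0, 0, 1, 0]


def agreement(d: Sequence[int]) -> float:
    return float(sum(1 for a, b in zip(d, target) if a == b))


best_config, best_value = anneal_binary(agreement, n_bits=len(target))
```

In a multiple testing setting the binary vector would play the role of the decision configuration (one accept/reject indicator per hypothesis) and the objective would be the criterion to be optimized; the toy objective here is separable and serves only to show the mechanics.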

The basic multiple testing set-up
New error based criterion
A brief overview of error rates in multiple testing
A new Bayesian false discovery rate and its properties
Controlling FDR
Type-II Errors in Multiple Testing
Optimality of the non-marginal method with respect to the “0-1” loss function
Optimality when all the parameters are dependent
Optimality in the case of block dependent parameters
Interpretation of posterior mFDR as appropriate probabilities
Preliminaries for ensuring posterior convergence under general set-up
KL-divergence when all the parameters are dependent
KL-divergence minimization in case of block-dependent parameters
Practical issues on implementation of the non-marginal procedure
Choice of the penalizing constant β
Simulation study
The true data generating mechanism
The postulated Bayesian model and p-value computation
Comparison Scheme for Performance Comparison to Competing Methods
Validation of mpBFDR
Real data analysis: radionuclide concentrations at Rongelap Atoll
Choices of the threshold c
Implementation of the Bayesian non-marginal procedure
Results of multiple testing
Summary and conclusion
