Abstract

Statistical dependence between hypotheses poses a significant challenge to the stability of large-scale multiple hypothesis testing. Ignoring this dependence often results in an unacceptably large spread in the false positive proportion, even when its average value is acceptable (Fan et al., J Amer Statist Assoc 107(499): 1019–1035, 2012; Owen, J R Stat Soc Ser B 67(3): 411–426, 2005; Qiu et al., Stat Appl Genet Mol Biol 4: 32, 2005; Schwartzman and Lin, Biometrika 98(1): 199–214, 2011). However, the statistical dependence structure of the data is often unknown. Using a generic signal-processing model, Bayesian multiple testing, and simulations, we demonstrate that the variance of the false positive proportion can be substantially reduced even under unknown short-range dependence. We do this by modeling the data-generating process as a stationary ergodic binary signal process embedded in noisy observations. We derive the conditional probabilities needed for Bayesian multiple testing by incorporating nearby observations into a second-order Taylor series approximation. Simulations under general conditions are carried out to assess the validity of the approach and the variance reduction it achieves. Along the way, we address the problem of sampling a random Markov matrix with a specified stationary distribution and lower bounds on the top absolute eigenvalues, which is of interest in its own right.
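
To make the data-generating model concrete, the following is a minimal sketch of the simplest special case, not the paper's construction. It assumes a two-state Markov chain as the binary signal and additive standard Gaussian noise with a mean shift `mu` for true signals. The function names and parameters (`two_state_markov_matrix`, `simulate`, `pi1`, `lam`, `mu`) are hypothetical. For two states, the transition matrix is fixed exactly by the stationary probability and the second eigenvalue, so no random sampling of the matrix is needed here; the general problem the abstract mentions is harder.

```python
import random


def two_state_markov_matrix(pi1, lam):
    """Return a 2x2 transition matrix with stationary P(state = 1) = pi1
    and second eigenvalue lam (illustrative two-state special case)."""
    if not 0.0 < pi1 < 1.0:
        raise ValueError("pi1 must lie strictly between 0 and 1")
    a = (1.0 - lam) * pi1          # P(0 -> 1)
    b = (1.0 - lam) * (1.0 - pi1)  # P(1 -> 0)
    if not (0.0 <= a <= 1.0 and 0.0 <= b <= 1.0):
        raise ValueError("lam is not feasible for this pi1")
    return [[1.0 - a, a], [b, 1.0 - b]]


def simulate(n, pi1, lam, mu, rng):
    """Simulate n steps of a binary Markov signal observed in Gaussian noise.

    Assumption: observation = mu * signal + N(0, 1) noise.
    Returns (signal, observations) as two lists of length n.
    """
    P = two_state_markov_matrix(pi1, lam)
    # Start from the stationary distribution so the process is stationary.
    state = 1 if rng.random() < pi1 else 0
    signal, obs = [], []
    for _ in range(n):
        signal.append(state)
        obs.append(mu * state + rng.gauss(0.0, 1.0))
        # P[state][1] is the probability that the next state is 1.
        state = 1 if rng.random() < P[state][1] else 0
    return signal, obs


if __name__ == "__main__":
    rng = random.Random(0)
    signal, obs = simulate(n=20000, pi1=0.2, lam=0.5, mu=2.0, rng=rng)
    print("empirical fraction of true signals:", sum(signal) / len(signal))
```

In this sketch, values of `lam` closer to 1 give longer runs of identical signal states, which is the kind of short-range dependence under which neighboring observations carry information about each hypothesis.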
