Bayesian prediction intervals for assessing P-value variability in prospective replication studies

Olga Vsevolozhskaya, Gabriel Ruiz, Dmitri Zaykin

doi:10.1038/s41398-017-0024-3

Abstract

Increased availability of data and accessibility of computational tools in recent years have created an unprecedented upsurge of scientific studies driven by statistical analysis. Limitations inherent to statistics impose constraints on the reliability of conclusions drawn from data, so misuse of statistical methods is a growing concern. Hypothesis and significance testing, and the accompanying P-values, are being scrutinized as representing the most widely applied and abused practices. One line of critique is that P-values are inherently unfit to fulfill their ostensible role as measures of credibility for scientific hypotheses. It has also been suggested that while P-values may have their role as summary measures of effect, researchers underappreciate the degree of randomness in the P-value. High variability of P-values would suggest that having obtained a small P-value in one study, one is, nevertheless, still likely to obtain a much larger P-value in a similarly powered replication study. Thus, “replicability of P-value” is in itself questionable. To characterize P-value variability, one can use prediction intervals whose endpoints reflect the likely spread of P-values that could have been obtained by a replication study. Unfortunately, the intervals currently in use, the frequentist P-intervals, are based on unrealistic implicit assumptions. Namely, P-intervals are constructed with the assumptions that imply substantial chances of encountering large values of effect size in an observational study, which leads to bias. The long-run frequentist probability provided by P-intervals is similar in interpretation to that of the classical confidence intervals, but the endpoints of any particular interval lack interpretation as probabilistic bounds for the possible spread of future P-values that may have been obtained in replication studies.
Along with classical frequentist intervals, there exists a Bayesian viewpoint toward interval construction in which the endpoints of an interval have a meaningful probabilistic interpretation. We propose Bayesian intervals for prediction of P-value variability in prospective replication studies. Contingent upon approximate prior knowledge of the effect size distribution, our proposed Bayesian intervals have endpoints that are directly interpretable as probabilistic bounds for replication P-values, and they are resistant to selection bias. We showcase our approach by its application to P-values reported for five psychiatric disorders by the Psychiatric Genomics Consortium group.
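The contrast drawn in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a one-sided Z-test, treats the mean of the Z statistic as the unknown quantity, and places a conjugate normal prior on that mean. The function name `replication_p_interval` and the prior values used below are hypothetical. Setting `prior_var=None` corresponds to a flat prior, which gives an interval of the P-interval type on the Z scale (observed Z plus or minus 1.96 times the square root of 2).

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution


def replication_p_interval(p_obs, prior_mean=0.0, prior_var=None, level=0.95):
    """Prediction interval for a one-sided replication P-value.

    Assumed model (illustrative only): Z | mu ~ N(mu, 1), mu ~ N(prior_mean, prior_var),
    and the replication study has the same sample size, hence the same mu.
    prior_var=None stands for a flat prior on mu.
    """
    z_obs = -STD_NORMAL.inv_cdf(p_obs)  # one-sided P-value -> Z statistic
    if prior_var is None:
        post_mean, post_var = z_obs, 1.0  # flat prior: posterior is N(z_obs, 1)
    else:
        shrink = prior_var / (1.0 + prior_var)  # weight given to the observed Z
        post_mean = shrink * z_obs + (1.0 - shrink) * prior_mean
        post_var = shrink
    # Predictive distribution of the replication Z: N(post_mean, 1 + post_var)
    half_width = STD_NORMAL.inv_cdf(0.5 + level / 2.0) * sqrt(1.0 + post_var)
    z_lo, z_hi = post_mean - half_width, post_mean + half_width
    # A larger Z means a smaller P-value, so the endpoints swap on the P scale.
    return STD_NORMAL.cdf(-z_hi), STD_NORMAL.cdf(-z_lo)


if __name__ == "__main__":
    for label, var in [("flat prior", None), ("prior variance 0.25", 0.25)]:
        lo, hi = replication_p_interval(0.05, prior_var=var)
        print(f"{label}: replication P-value in [{lo:.2g}, {hi:.2g}]")
```

With a prior concentrated near zero, the predicted Z is shrunk toward the prior mean, so the lower endpoint of the interval for the replication P-value is much less optimistic than under the flat prior.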

Highlights

Poor replicability has been plaguing observational studies
The observed P-value was based on a two-sample Z-test and was thresholded according to the following selection rules: (i) no selection, i.e., a prediction interval is constructed for a randomly observed P-value; (ii) selection of P-values around a value, e.g., P ≈ 0.05, i.e., prediction intervals are constructed only for P-values that were close to the 5% significance level; (iii) selection of P-values that are smaller than a threshold, e.g., P
Researchers want to know whether a statistic used for summarizing their data supports their scientific hypothesis and to what degree
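The selection rules mentioned in the highlights can be explored with a small simulation. This is a sketch under stated assumptions, not a reproduction of the paper's study: effects are drawn from a zero-mean normal prior with a hypothetical variance of 0.25 on the Z scale, P-values are one-sided, and the window 0.04 to 0.06 and the threshold 0.05 are illustrative choices. The helper names `z_interval` and `coverage` are made up for this example. Coverage is checked on the Z scale, which is equivalent to checking it on the P-value scale because the P-value is a monotone function of Z.

```python
import random
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()
Z_CRIT = STD_NORMAL.inv_cdf(0.975)  # multiplier for a two-sided 95% interval


def z_interval(z_obs, prior_var=None):
    """95% predictive interval for a replication Z; prior_var=None is the flat prior."""
    shrink = 1.0 if prior_var is None else prior_var / (1.0 + prior_var)
    center = shrink * z_obs  # prior mean assumed to be zero
    half_width = Z_CRIT * sqrt(1.0 + shrink)
    return center - half_width, center + half_width


def coverage(select, prior_var=0.25, n_sim=200_000, seed=1):
    """Fraction of replication Z values covered by the flat-prior and Bayesian intervals,
    computed only over original studies whose P-value passes the rule `select`."""
    rng = random.Random(seed)
    hits_flat = hits_bayes = kept = 0
    for _ in range(n_sim):
        mu = rng.gauss(0.0, sqrt(prior_var))  # true mean of Z for this study
        z_obs = rng.gauss(mu, 1.0)            # original study
        if not select(STD_NORMAL.cdf(-z_obs)):
            continue
        kept += 1
        z_rep = rng.gauss(mu, 1.0)            # replication with the same sample size
        lo, hi = z_interval(z_obs)
        hits_flat += lo <= z_rep <= hi
        lo, hi = z_interval(z_obs, prior_var)
        hits_bayes += lo <= z_rep <= hi
    return hits_flat / kept, hits_bayes / kept


RULES = {
    "no selection": lambda p: True,
    "P near 0.05 (assumed window 0.04-0.06)": lambda p: 0.04 < p < 0.06,
    "P below threshold (assumed 0.05)": lambda p: p < 0.05,
}

if __name__ == "__main__":
    for name, rule in RULES.items():
        flat, bayes = coverage(rule)
        print(f"{name}: flat-prior coverage {flat:.3f}, Bayesian coverage {bayes:.3f}")
```

In this setup the flat-prior interval attains its nominal long-run coverage only when no selection is applied, whereas the interval built from the same prior that generated the effects stays close to 95% under each rule.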

Summary

Introduction

Poor replicability has been plaguing observational studies. The “replicability crisis” is largely statistical, and while there are limits to what statistics can do, a serious concern [...] introduced by Killeen[1], to mean the P-value obtained from subsequent, replicate experiments with the same sample size, taken from the same population. It has been suggested that P-intervals may serve the purpose of improving P-value interpretability, especially in large-scale genomic studies with many tests or in other studies utilizing modern high-throughput technologies[5,6]. In their discussion of P-values and their prediction intervals, Lazzeroni and colleagues[6] argued that the P-values “will continue to have an important role in research” and that “no other statistic fills this particular niche.” They present P-intervals not as a way to expose alleged weaknesses of P-values but rather as a tool for assessing the real uncertainty inherent in P-values.

Vsevolozhskaya et al. Translational Psychiatry (2017) 7:1271

Methods

Results

Conclusion


Journal: Translational Psychiatry	Publication Date: Dec 1, 2017
Citations: 4	License type: open-access


Similar Papers

Publishing replication studies to support excellence in physiological research.
Paul Mcloughlin ... Gordon Drummond
Experimental physiology | VOL. 102
30 Aug 2017

P-values in genomics: Apparent precision masks high uncertainty
L C Lazzeroni ... Y Lu
Molecular Psychiatry | VOL. 19
14 Jan 2014

Coronary Heart Disease Risk Prediction in the Era of Genome-Wide Association Studies
Steve E Humphries ... Gie Ken-Dror
Circulation | VOL. 121
24 May 2010

Some considerations on target estimands for health technology assessment.
Antonio Remiro‐Azócar
Statistics in medicine | VOL. 41
17 Nov 2022
