Reverse Bayesian Implications of p-Values Reported in Critical Care Randomized Trials.

Sarah Nostedt,Ari R Joffe

doi:10.1177/08850666211053793

Abstract

BackgroundMisinterpretations of the p-value in null-hypothesis statistical testing are common. We aimed to determine the implications of observed p-values in critical care randomized controlled trials (RCTs).MethodsWe included three cohorts of published RCTs: Adult-RCTs reporting a mortality outcome, Pediatric-RCTs reporting a mortality outcome, and recent Consecutive-RCTs reporting p-value ≤.10 in six higher-impact journals. We recorded descriptive information from RCTs. Reverse Bayesian implications of obtained p-values were calculated, reported as percentages with inter-quartile ranges.ResultsObtained p-value was ≤.005 in 11/216 (5.1%) Adult-RCTs, 2/120 (1.7%) Pediatric-RCTs, and 37/90 (41.1%) Consecutive-RCTs. An obtained p-value .05–.0051 had high False Positive Rates; in Adult-RCTs, minimum (assuming prior probability of the alternative hypothesis was 50%) and realistic (assuming prior probability of the alternative hypothesis was 10%) False Positive Rates were 16.7% [11.2, 21.8] and 64.3% [53.2, 71.4]. An obtained p-value ≤.005 had lower False Positive Rates; in Adult-RCTs the realistic False Positive Rate was 7.7% [7.7, 16.0]. The realistic probability of the alternative hypothesis for obtained p-value .05–.0051 (ie, Positive Predictive Value) was 28.0% [24.1, 34.8], 30.6% [27.7, 48.5], 29.3% [24.3, 41.0], and 32.7% [24.1, 43.5] for Adult-RCTs, Pediatric-RCTs, Consecutive-RCTs primary and secondary outcome, respectively. The maximum Positive Predictive Value for p-value category .05–.0051 was median 77.8%, 79.8%, 78.8%, and 81.4% respectively. To have maximum or realistic Positive Predictive Value >90% or >80%, RCTs needed to have obtained p-value ≤.005. The credibility of p-value .05–.0051 findings were easy to challenge, and the credibility to rule-out an effect with p-value >.05 to .10 was low. The probability that a replication study would obtain p-value ≤.05 did not approach 90% unless the obtained p-value was ≤.005.ConclusionsUnless the obtained p-value was ≤.005, the False Positive Rate was high, and the Positive Predictive Value and probability of replication of “statistically significant” findings were low.

Full Text