In this article, I discuss the potential pitfalls of interpreting p values, confidence intervals, and declarations of statistical significance. To illustrate the issues, I discuss the LOVIT trial, which compared high-dose vitamin C with placebo in mechanically ventilated patients with sepsis. The primary outcome – the proportion of patients who died or had persisting organ dysfunction at day 28 – was significantly higher in patients who received vitamin C (p = .01). The authors had hypothesized that vitamin C would have a beneficial effect, although the prior evidence for benefit was weak. There was no prior evidence for a harmful effect of high-dose vitamin C. Consequently, the pretest probability for harm was low. The sample size was calculated assuming a 10% absolute risk difference, which was optimistic. Overestimating the effect size when calculating the sample size leads to an underpowered trial. For these reasons, we should be skeptical that vitamin C causes harm in septic patients, despite the significant result. p values and confidence intervals describe the probability of obtaining the observed data under an assumed hypothesis. However, we are usually more interested in the probability that the intervention has a real effect on the outcome, that is, in whether the hypothesis is true. A Bayesian approach allows us to estimate the false positive risk, which is the post-test probability that there is no effect of the intervention. The false positive risk for the LOVIT trial (calculated from the published summary data using uniform priors for the parameter values) is 70%. Most likely, high-dose vitamin C does not cause harm in septic patients; most likely, it has no effect at all. If there is an effect, it is probably small and most likely beneficial.
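The abstract does not spell out how the 70% figure was obtained. As a rough sketch of one standard Bayesian calculation consistent with the description (a two-arm binary outcome, uniform Beta(1, 1) priors on the event probabilities, and a point null of "no effect"), the false positive risk can be computed as the posterior probability of the null hypothesis derived from a Bayes factor. The function name, the default pretest probability of 0.5, and the event counts below are hypothetical illustrations, not the published LOVIT data or the authors' actual method.

```python
import math


def false_positive_risk(s1, n1, s2, n2, prior_prob_null=0.5):
    """Posterior probability of 'no effect' (H0) for a two-arm binary outcome.

    H0: both arms share a single event probability; H1: each arm has its own.
    Uniform Beta(1, 1) priors are placed on all event probabilities,
    mirroring the 'uniform priors' mentioned in the abstract.
    The default 50:50 pretest probability is an assumption; the article's
    argument is that the pretest probability of harm was much lower.
    """
    f1, f2 = n1 - s1, n2 - s2

    def betaln(a, b):
        # log of the Beta function, for numerically stable marginal likelihoods
        return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

    # Marginal likelihoods (the binomial coefficients cancel in the ratio):
    # H0: integrate one shared p over Beta(1,1) -> Beta(s1+s2+1, f1+f2+1)
    # H1: integrate p1, p2 independently -> Beta(s1+1, f1+1) * Beta(s2+1, f2+1)
    log_bf01 = betaln(s1 + s2 + 1, f1 + f2 + 1) - (
        betaln(s1 + 1, f1 + 1) + betaln(s2 + 1, f2 + 1)
    )

    # Convert the Bayes factor and prior odds into a posterior probability of H0
    prior_odds_null = prior_prob_null / (1 - prior_prob_null)
    posterior_odds_null = math.exp(log_bf01) * prior_odds_null
    return posterior_odds_null / (1 + posterior_odds_null)


# Hypothetical event counts for illustration only (not the LOVIT data):
# 190/430 events with the intervention vs 165/435 with placebo.
print(f"False positive risk: {false_positive_risk(190, 430, 165, 435):.2f}")
```

Note that the resulting false positive risk depends strongly on the pretest probability assigned to the null, which is precisely why a low prior probability of harm, as in LOVIT, can leave the post-test probability of "no effect" high even when p = .01.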