Abstract
When comparing the efficacy of two treatments in a clinical trial, or when following up two groups in an observational study, four outcomes are possible: 1) the study detects a “true” difference; 2) the study finds a difference, but there is no “true” difference (alpha error); 3) the study finds no difference, and there is none; and 4) the study demonstrates no difference, but there is a “true” difference (beta error). The P value indicates the probability of alpha error (outcome #1 vs #2) and is calculated at the study’s conclusion. The likelihood of beta error can be reduced before starting the study by performing a power calculation. The statistical power of a study is influenced principally by the number of study participants and the size of the difference to be detected.

Power calculations are used when planning a study to determine the likelihood that, if a predetermined clinically meaningful difference is present, it will be detected. The most contentious part of a power calculation is deciding what constitutes a clinically meaningful difference. A power of 80% or 90% to detect this difference is generally considered sufficient to support the conclusion that no clinically meaningful difference exists between the two groups.

In the study by Mekahli et al described in this issue, renal outcomes were compared between children with autosomal dominant polycystic kidney disease diagnosed by prenatal ultrasound and those diagnosed only when they presented with symptoms. No differences were detected between the two groups. This finding could be “true” (outcome #3 above) or false (outcome #4). Because the investigators did not report a power calculation, we do not know whether their study had adequate statistical power and sample size to detect a true difference between the groups.

How should readers use power calculations? In a study that demonstrates no difference between two treatments, check whether the authors report a power calculation; its absence is an important weakness. However, once the study is done, what the investigators expected to find at the design stage no longer matters; what they actually found determines the usefulness of the study. This is best expressed using a 95% confidence interval, which uses the data generated by the study to estimate a range of values likely to include the parameter of interest in the general population. Unfortunately, Mekahli et al also did not report confidence intervals for the differences in outcomes between the two groups.
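To make the arithmetic concrete, the following is a minimal sketch in Python of the kind of sample-size calculation described above, using the standard normal approximation for comparing two proportions. The outcome rates, alpha, and power shown are hypothetical illustrations, not values from Mekahli et al.

```python
import math
from statistics import NormalDist  # Python 3.8+ standard library

def sample_size_two_proportions(p1, p2, alpha=0.05, power=0.80):
    """Per-group sample size needed to detect a difference between two
    proportions with a two-sided z-test (normal approximation)."""
    z = NormalDist().inv_cdf
    z_alpha = z(1 - alpha / 2)   # critical value tied to alpha (type I) error
    z_beta = z(power)            # quantile tied to the desired power (1 - beta)
    p_bar = (p1 + p2) / 2        # pooled proportion under the null hypothesis
    numerator = (z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
                 + z_beta * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return math.ceil(numerator / (p1 - p2) ** 2)

# Hypothetical planning example: to detect a drop in an adverse-outcome
# rate from 30% to 15% with 80% power at alpha = 0.05:
print(sample_size_two_proportions(0.30, 0.15))  # -> 121 per group
```

Note how the required sample size grows as the clinically meaningful difference to be detected shrinks, which is why the choice of that difference is the contentious step.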
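Likewise, a minimal sketch of the 95% confidence interval for a difference between two proportions (a simple Wald interval; the event counts here are invented for illustration):

```python
from statistics import NormalDist

def diff_in_proportions_ci(x1, n1, x2, n2, level=0.95):
    """Wald confidence interval for the difference between two proportions
    (x events out of n participants in each group)."""
    p1, p2 = x1 / n1, x2 / n2
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    se = (p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2) ** 0.5
    diff = p1 - p2
    return diff - z * se, diff + z * se

# Invented counts: 12/40 events in one group vs 9/38 in the other.
low, high = diff_in_proportions_ci(12, 40, 9, 38)
print(f"95% CI for the difference: ({low:.2f}, {high:.2f})")  # about (-0.13, 0.26)
```

An interval this wide, straddling zero, illustrates why a “no difference” finding can still be compatible with a clinically meaningful difference in either direction.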