Abstract

When testing treatment effects in multi-arm clinical trials, the Bonferroni method or the method of Simes (1986) is used to adjust for the multiple comparisons. When control of the family-wise error rate is required, these methods are combined with the closed testing principle of Marcus et al. (1976). Under weak assumptions, the resulting p-values all give rise to valid tests provided that the basic test used for each treatment is valid. However, standard tests can be far from valid, especially when the endpoint is binary and when sample sizes are unbalanced, as is common in multi-arm clinical trials. This paper examines the relationship between size deviations of the component test and size deviations of the multiple comparison test. The conclusion is that multiple comparison tests are as imperfect as the basic tests at nominal size α/m, where m is the number of treatments. This conclusion, admittedly not unexpected, implies that these methods should only be used when the component test is very accurate at small nominal sizes. For binary endpoints, this suggests use of the parametric bootstrap test. All these conclusions are supported by a detailed numerical study.
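As an illustration of the procedures named above, the following is a minimal Python sketch of closed testing with either a Bonferroni or a Simes local test. It is not taken from the paper: the function names (`bonferroni_p`, `simes_p`, `closed_adjusted`) and the example p-values are hypothetical, and the brute-force enumeration of intersection hypotheses is assumed to be practical only for a small number of treatments m.

```python
from itertools import combinations


def bonferroni_p(pvals):
    """Bonferroni p-value for the intersection of the given hypotheses."""
    return min(1.0, len(pvals) * min(pvals))


def simes_p(pvals):
    """Simes (1986) p-value: min over k of m * p_(k) / k, with p_(k) sorted ascending."""
    m = len(pvals)
    ordered = sorted(pvals)
    return min(1.0, min(m * p / (k + 1) for k, p in enumerate(ordered)))


def closed_adjusted(pvals, local_test):
    """Closed-testing adjusted p-values (illustrative helper, not from the paper).

    The adjusted p-value for hypothesis i is the largest local p-value over
    every intersection hypothesis (subset of indices) that contains i.
    """
    m = len(pvals)
    adjusted = [0.0] * m
    for size in range(1, m + 1):
        for subset in combinations(range(m), size):
            p_local = local_test([pvals[i] for i in subset])
            for i in subset:
                adjusted[i] = max(adjusted[i], p_local)
    return adjusted


# Hypothetical unadjusted p-values from three treatment-vs-control comparisons.
raw = [0.01, 0.04, 0.03]
bonf_adj = closed_adjusted(raw, bonferroni_p)   # coincides with Holm's procedure
simes_adj = closed_adjusted(raw, simes_p)       # coincides with Hommel's procedure
```

In both cases the global intersection is tested by comparing the smallest raw p-value with α/m, so the accuracy of the adjusted result depends on how well the component test holds its size at that small nominal level, which is the point made in the abstract.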
