Abstract

In randomized clinical trials with censored time-to-event outcomes, the logrank test is known to have substantial statistical power under the proportional hazards assumption and is widely adopted as a tool to compare two survival distributions. However, the proportional hazards assumption cannot be validated in practice until the data are unblinded, whereas the statistical analysis plan of a randomized clinical trial, and in particular its primary analysis method, must be prespecified before any unblinded information may be reviewed. The purpose of this article is to guide applied biostatisticians in the prespecification of a primary analysis method when a treatment effect with nonproportional hazards is anticipated. Although many articles propose alternative statistical tests, to the best of our knowledge none attempts to simplify the choice and prespecification of a primary statistical test under specific expected patterns of nonproportional hazards. We provide such guidance by reviewing various tests proposed as more powerful alternatives to the standard logrank test under nonproportional hazards and comparing their performance under a wide variety of nonproportional hazards scenarios to elucidate their advantages and disadvantages. To select the preferable test for detecting specific differences between survival distributions of interest while controlling false positive rates, we use simulation to review and assess the performance of weighted and adaptively weighted logrank tests, weighted and adaptively weighted Kaplan-Meier tests, and versatile tests under various patterns of nonproportional hazards treatment effects. We validate some of the claimed properties of the proposed extensions and identify tests that may be preferable under specific expected patterns of nonproportional hazards when such knowledge is available.
We show that versatile tests, while robust to departures from proportional hazards, may lose their directional interpretation (superiority or inferiority) and can only be regarded as tests of departure from equality. A detailed summary and discussion of the performance of each test in terms of type I error rate and power are provided to formulate specific guidance on their applicability and use.
