Abstract

The log-rank test is most powerful under proportional hazards (PH). In practice, non-PH patterns are often observed in clinical trials, such as in immuno-oncology; therefore, alternative methods are needed to restore the efficiency of statistical testing. Three categories of testing methods were evaluated: weighted log-rank tests, Kaplan–Meier curve-based tests (including weighted Kaplan–Meier and restricted mean survival time), and combination tests (including the Breslow test, Lee’s combo test, and the MaxCombo test). Nine scenarios representing PH and various non-PH patterns were simulated. The power, Type I error, and effect estimate of each method were compared. In general, all tests control Type I error well. No single test is most powerful across all scenarios. In the absence of prior knowledge regarding the underlying PH or non-PH pattern, the MaxCombo test is relatively robust across patterns. Since the treatment effect changes over time under non-PH, the overall profile of the treatment effect may not be captured comprehensively by a single measure. Thus, multiple measures of the treatment effect should be prespecified as sensitivity analyses to describe the totality of the data. Supplementary materials for this article are available online.
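To make the weighted log-rank category concrete, the following is a minimal sketch of a two-sample weighted log-rank Z statistic. It assumes the Fleming-Harrington G(rho, gamma) weight family, which the abstract does not name; the function name `fh_weighted_logrank`, its parameters, and the toy data are hypothetical and are not taken from the article. With rho = gamma = 0 the statistic reduces to the standard log-rank test, while gamma > 0 puts more weight on late differences, as might be of interest under a delayed treatment effect.

```python
import math


def fh_weighted_logrank(times, events, groups, rho=0.0, gamma=0.0):
    """Two-sample weighted log-rank test (illustrative sketch).

    Assumption: Fleming-Harrington weights w(t) = S(t-)^rho * (1 - S(t-))^gamma,
    where S is the pooled Kaplan-Meier estimate just before time t.
    times  : observed follow-up times
    events : 1 if the event was observed, 0 if censored
    groups : 1 for treatment, 0 for control
    Returns (z, two_sided_p). A negative z means fewer events than expected
    in group 1.
    """
    event_times = sorted({t for t, e in zip(times, events) if e})
    surv = 1.0  # pooled Kaplan-Meier survival just before the current time
    u = 0.0     # weighted sum of observed minus expected events in group 1
    v = 0.0     # hypergeometric variance of u
    for t in event_times:
        n = sum(1 for x in times if x >= t)  # at risk, pooled
        n1 = sum(1 for x, g in zip(times, groups) if x >= t and g == 1)
        d = sum(1 for x, e in zip(times, events) if x == t and e)
        d1 = sum(1 for x, e, g in zip(times, events, groups)
                 if x == t and e and g == 1)
        w = surv ** rho * (1.0 - surv) ** gamma
        u += w * (d1 - d * n1 / n)
        if n > 1:
            v += w * w * d * (n1 / n) * (1.0 - n1 / n) * (n - d) / (n - 1)
        surv *= 1.0 - d / n  # update the pooled KM estimate after time t
    z = u / math.sqrt(v)
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal p-value
    return z, p


if __name__ == "__main__":
    # Toy data for illustration only; not from the article.
    t = [2, 3, 4, 5, 6, 7, 9, 10, 12, 14]
    e = [1, 1, 1, 0, 1, 1, 1, 0, 1, 1]
    g = [0, 1, 0, 1, 0, 0, 1, 1, 0, 1]
    for rho, gamma in [(0, 0), (1, 0), (0, 1)]:
        z, p = fh_weighted_logrank(t, e, g, rho, gamma)
        print(f"G({rho},{gamma}): z = {z:.3f}, p = {p:.3f}")
```

A combination test such as MaxCombo takes the most extreme of several such statistics and must account for their correlation when computing the p-value; that step is deliberately left out of this sketch.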
