Background/aims: Safety monitoring is a crucial requirement for Phase II and III clinical trials. To protect patients from toxicity risk, stopping rules may be implemented that halt the study if an unexpectedly high number of events occurs. These rules are constructed using statistical procedures that typically treat the toxicity data as binary occurrences. Because the exact dates of toxicities are often available, a strategy that handles these as time-to-event data may offer higher power and require less calendar time to identify excess risk. This work investigates several statistical methods for monitoring safety events as time-to-event endpoints and illustrates our R software package for designing and evaluating these procedures.

Methods: The performance metrics of safety stopping rules derived from Wang–Tsiatis tests, Bayesian Gamma–Poisson models, and sequential probability ratio tests are evaluated and contrasted in Phase II and III trial scenarios. We developed a publicly available R package, "stoppingrule", for designing and assessing these stopping rules; its utility is illustrated through the design of a stopping rule for Blood and Marrow Transplant Clinical Trials Network 1204 (National Clinical Trial number NCT01998633), a multicenter, Phase II, single-arm trial that assessed the efficacy and safety of bone marrow transplant for the treatment of hemophagocytic lymphohistiocytosis and primary immune deficiencies.

Results: As seen previously in group sequential testing settings, rules with strict stopping criteria early in a study tend to have more lenient stopping criteria late in the trial. Consequently, methods with aggressive early monitoring, such as Gamma–Poisson models with weak priors and certain choices of truncated sequential probability ratio tests, usually yield fewer toxicities and lower power than methods that are more permissive at early stages, such as Gamma–Poisson models with strong priors and the O'Brien–Fleming test. The Pocock test and the maximized sequential probability ratio test ran contrary to these trends, however, exhibiting both diminished power and higher numbers of toxicities than the other methods: their extremely aggressive early stopping criteria fail to reserve adequate power to identify safety issues beyond the start of the study. Relative to binary toxicity approaches, our time-to-event methods offer meaningful reductions in expected toxicities, of up to 20% across the scenarios considered.

Conclusion: Safety monitoring procedures aim to guard study participants from being exposed to, and suffering toxicity from, unsafe treatments. Toward this end, we recommend the time-to-event Gamma–Poisson model with a weak prior or the truncated sequential probability ratio test for constructing safety stopping rules, as these performed best at minimizing the number of toxicities in our investigations. Our R package "stoppingrule" offers procedures for creating and assessing such stopping rules to aid trial design.
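To make the Bayesian Gamma–Poisson approach concrete, the following is a minimal R sketch of the standard conjugate calculation underlying such a rule. The function name, prior parameters, unacceptable rate, and posterior cutoff below are all illustrative assumptions for exposition; they are not the stoppingrule package's API or the trial's actual design values.

```r
# Illustrative sketch of a Bayesian Gamma-Poisson toxicity stopping rule.
# Assumptions (not from the source): prior Gamma(a, b) on the toxicity rate
# lambda (events per patient-year), an "unacceptable" rate lambda_null, and a
# posterior probability cutoff p_cut. With n events observed over total
# follow-up T, conjugacy gives the posterior lambda | data ~ Gamma(a + n, b + T).
stop_for_toxicity <- function(n_events, exposure, a = 1, b = 10,
                              lambda_null = 0.1, p_cut = 0.95) {
  # Posterior probability that the true rate exceeds lambda_null
  post_prob <- pgamma(lambda_null, shape = a + n_events, rate = b + exposure,
                      lower.tail = FALSE)
  list(post_prob = post_prob, stop = post_prob >= p_cut)
}

# Hypothetical interim look: 4 toxicities in 15 patient-years of follow-up
res <- stop_for_toxicity(n_events = 4, exposure = 15)
```

Evaluating this criterion at each observed toxicity, rather than at fixed patient counts, is what lets a time-to-event rule react as exposure accrues; a stronger prior (larger a and b at the same prior mean) makes early stopping harder, matching the prior-strength trade-off described in the Results.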