This paper proposes a platform trial design for conducting A/B tests with multiple arms and interim monitoring, and investigates the impact of several factors on the expected sample size and the probability of early stopping. We examined the performance of three designs: O'Brien-Fleming (OBF) boundaries with stopping for both futility and difference, Pocock boundaries with stopping for futility only, and a fixed sample size design. We simulated twelve scenarios that vary the order of the arms across a range of effect sizes, and considered one or three interim looks. The overall findings are summarized in a flowchart that provides intuitive, simulation-based guidance for designing the platform. We found that OBF stopping for both futility and difference is preferable when at least one variant is effective and the trial is sufficiently powered to detect the expected effect size. If the study is underpowered to detect a difference, we recommend a fixed sample size design to gather as much information as possible; however, if minimizing the expected sample size is important, we recommend Pocock boundaries with futility monitoring. Our results aim to help high-tech companies conduct their own studies without requiring extensive knowledge of clinical trial design and statistical methodology.
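To make the contrast between the two boundary families concrete, the following is a minimal sketch, not the paper's simulation code. It assumes a single two-sided comparison with equally spaced looks, treats three interim looks plus a final analysis as four analyses, and calibrates the boundary constant by Monte Carlo under the null rather than by the numerical integration usually used. It covers only boundaries for stopping on a difference; futility boundaries and multiple arms are not modeled. The function names `boundary_shape`, `simulate_null_z`, and `calibrate` are hypothetical.

```python
import numpy as np

def boundary_shape(kind, n_looks):
    """Relative shape of the |z| boundary at each of n_looks equally spaced analyses."""
    k = np.arange(1, n_looks + 1)
    if kind == "obf":
        # O'Brien-Fleming shape: very strict early, relaxing to 1 at the final analysis.
        return np.sqrt(n_looks / k)
    if kind == "pocock":
        # Pocock shape: the same critical value at every analysis.
        return np.ones(n_looks)
    raise ValueError(f"unknown boundary kind: {kind}")

def simulate_null_z(n_looks, n_sims, rng):
    """Z-statistics at each analysis under no difference (assumed equal information per look)."""
    increments = rng.standard_normal((n_sims, n_looks))
    cumulative = np.cumsum(increments, axis=1)
    k = np.arange(1, n_looks + 1)
    return cumulative / np.sqrt(k)

def calibrate(kind, n_looks, alpha=0.05, n_sims=200_000, seed=0):
    """Scale the boundary shape so the overall two-sided type I error is about alpha."""
    rng = np.random.default_rng(seed)
    z = np.abs(simulate_null_z(n_looks, n_sims, rng))
    shape = boundary_shape(kind, n_looks)
    # A path crosses the boundary c * shape at some look iff max_k |z_k| / shape_k >= c,
    # so c is the (1 - alpha) quantile of that maximum under the null.
    crossing_stat = np.max(z / shape, axis=1)
    c = np.quantile(crossing_stat, 1 - alpha)
    return c * shape

# Three interim looks plus the final analysis (an assumption of this sketch).
obf = calibrate("obf", n_looks=4)
pocock = calibrate("pocock", n_looks=4)
print("OBF boundaries:   ", np.round(obf, 2))
print("Pocock boundaries:", np.round(pocock, 2))
```

Under these assumptions, the OBF boundary demands very strong evidence at the first look and ends close to the fixed-design critical value, while the Pocock boundary is flat and makes early stopping easier at the cost of a stricter final analysis.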