Background
In group sequential designs, it is typically assumed that there is no time gap between patient enrollment and outcome measurement in clinical trials. In practice, however, there is usually a lag between the two time points. This can affect the statistical analysis of the data, especially in trials with interim analyses. One approach to address delayed responses was introduced by Hampson and Jennison (J R Stat Soc Ser B Stat Methodol 75:3-54, 2013), who proposed error-spending stopping boundaries for patient enrollment, followed by critical values used to reject the null hypothesis if the stopping boundaries were crossed beforehand. When choosing a trial design, it is important to consider the efficiency of the candidate designs, e.g., in terms of the probability of trial success (power) and the required resources (sample size and time).

Methods
This article aims to shed more light on the performance comparison of group sequential clinical trial designs that account for delayed responses and designs that do not. We describe suitable performance measures and evaluate the designs using the R package rpact. In doing so, we provide insight into global performance measures, discuss the applicability of conditional performance characteristics, and assess whether the performance gain justifies the use of more complex trial designs that incorporate delayed responses.

Results
We investigated how the delayed response group sequential design (DR-GSD) proposed by Hampson and Jennison (J R Stat Soc Ser B Stat Methodol 75:3-54, 2013) can be extended to include nonbinding lower recruitment stopping boundaries, illustrating that their original design framework can accommodate both binding and nonbinding rules when additional constraints are imposed.
Our findings indicate that the performance gains from methods incorporating delayed responses depend heavily on the sample size at interim and the volume of data in the pipeline, and that the overall gains are limited.

Conclusion
This research extends the existing literature on group sequential designs by offering insights into performance differences. We conclude that, given the overall marginal differences, discussions regarding the appropriate trial design can pivot toward practical considerations of operational feasibility.