Slow patient enrollment, or failure to enroll the required number of patients, disrupts clinical trial timelines. To meet planned recruitment targets, site selection strategies are used during clinical trial planning to identify the research sites most likely to recruit a sufficient number of subjects within trial timelines. We developed a machine learning approach that ranks research sites by their expected recruitment in future studies and outperforms baseline methods. The approach uses indication-level historical recruitment data and real-world data to predict patient enrollment at the site level. We define covariates based on published recruitment hypotheses and examine their effect on predicted patient enrollment. We compare the performance of a linear and a non-linear machine learning model with common industry baselines constructed from historical recruitment data. Performance is evaluated and reported for two disease indications, inflammatory bowel disease and multiple myeloma, both of which are actively being pursued in clinical development. We validate the recruitment hypotheses by reviewing each covariate's relationship with patient recruitment. For both indications, the non-linear model significantly outperforms the baselines and the linear model on the test set. In summary, the approach incorporates site-level recruitment and real-world patient data, ranks research sites by their predicted number of recruited patients, and our results suggest that it can improve site ranking compared to common industry baselines.
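As a rough illustration of the kind of historical-recruitment baseline mentioned above, the following minimal sketch scores each site by its mean enrollment in past studies, ranks sites by that score, and checks the ranking against later enrollment. The site identifiers, enrollment counts, function names, and the top-k overlap measure are all hypothetical choices made for this example; the abstract does not specify how the baselines are built or which evaluation metric is used.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical historical records: (site_id, patients enrolled in one past study).
history = [
    ("site_A", 12), ("site_A", 8),
    ("site_B", 3),
    ("site_C", 20), ("site_C", 16), ("site_C", 18),
    ("site_D", 5), ("site_D", 7),
]


def baseline_scores(records):
    """Score each site by its mean enrollment over its past studies."""
    per_site = defaultdict(list)
    for site, enrolled in records:
        per_site[site].append(enrolled)
    return {site: mean(counts) for site, counts in per_site.items()}


def rank_sites(scores):
    """Order site ids from highest to lowest score, breaking ties by id."""
    return sorted(scores, key=lambda site: (-scores[site], site))


def top_k_overlap(predicted_ranking, actual_enrollment, k):
    """Fraction of the k truly best sites that appear in the predicted top k.

    This is an assumed evaluation measure, chosen only for illustration.
    """
    actual_ranking = rank_sites(actual_enrollment)
    return len(set(predicted_ranking[:k]) & set(actual_ranking[:k])) / k


scores = baseline_scores(history)
ranking = rank_sites(scores)

# Hypothetical enrollment observed in a later study, used only for evaluation.
future = {"site_A": 9, "site_B": 11, "site_C": 17, "site_D": 4}
overlap = top_k_overlap(ranking, future, k=2)

print(ranking)
print(overlap)
```

A learned model would replace `baseline_scores` with predictions from site-level covariates, while the ranking and evaluation steps could stay the same, which makes the comparison against the baseline direct.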