Abstract

Clinical trials represent a critical milestone of translational and clinical sciences. However, poor recruitment to clinical trials has been a long-standing problem affecting institutions all over the world. One way to reduce the cost incurred by insufficient enrollment is to avoid initiating trials that are likely to fall short of their enrollment goals. Hence, the ability to predict, before a trial starts, whether it will meet its enrollment goal is highly beneficial. In the current study, we leveraged a data set extracted from ClinicalTrials.gov that consists of 46,724 U.S.-based clinical trials from 1990 to 2020. We constructed 4,636 candidate predictors from data collected by ClinicalTrials.gov and external sources, and used them for enrollment rate prediction with various state-of-the-art machine learning methods. Taking advantage of a nested time series cross-validation design, our models achieved good predictive performance that is generalizable to future data and stable over time. Moreover, information content analysis revealed study-design-related features to be the most informative feature type regarding enrollment. Models built with study-design-related features alone performed only marginally worse than models built with all features (AUC = 0.76 ± 0.02 vs. AUC = 0.78 ± 0.03). The results presented can form the basis for data-driven decision support systems that assess whether proposed clinical trials are likely to meet their enrollment goals.
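The abstract names a nested time series cross-validation design but does not spell out its details. As a rough illustration only, the sketch below shows one common way such a design can be laid out: trials are assumed to be sorted by start date, outer folds evaluate on a later block of trials, and inner folds (for hyperparameter tuning) are carved out of each outer training block so that validation data always follows its training data in time. The function names, the expanding-window scheme, and the fold counts are assumptions for illustration, not the authors' procedure.

```python
from typing import Iterator, List, Tuple

Split = Tuple[List[int], List[int]]


def expanding_splits(n_samples: int, n_splits: int) -> Iterator[Split]:
    """Yield (train, test) index lists; each test block follows its training data.

    Assumes samples are already sorted chronologically (hypothetical setup).
    """
    if n_splits < 1 or n_samples < n_splits + 1:
        raise ValueError("need at least n_splits + 1 samples")
    block = n_samples // (n_splits + 1)
    for k in range(1, n_splits + 1):
        train_end = k * block
        # The last fold absorbs any leftover samples.
        test_end = n_samples if k == n_splits else train_end + block
        yield list(range(train_end)), list(range(train_end, test_end))


def nested_time_series_splits(
    n_samples: int, outer_splits: int, inner_splits: int
) -> Iterator[Tuple[List[int], List[int], List[Split]]]:
    """Yield (outer_train, outer_test, inner_folds) for each outer fold.

    Inner folds use only rows from the outer training block, so the outer
    test block is never seen during tuning.
    """
    for outer_train, outer_test in expanding_splits(n_samples, outer_splits):
        inner_folds = [
            ([outer_train[i] for i in tr], [outer_train[i] for i in va])
            for tr, va in expanding_splits(len(outer_train), inner_splits)
        ]
        yield outer_train, outer_test, inner_folds


if __name__ == "__main__":
    for train, test, inner in nested_time_series_splits(12, 3, 2):
        print("outer train:", train, "outer test:", test)
        for inner_train, inner_val in inner:
            print("   inner train:", inner_train, "inner val:", inner_val)
```

In a layout like this, hyperparameters would be chosen on the inner folds and the selected model scored once on the outer test block, which is what lets the outer score stand in for performance on future trials.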

Highlights

  • Clinical trials represent a critical milestone of translational and clinical science, with the most direct potential impact on healthcare-related outcomes

  • Since we are interested in predicting low, medium, and high enrollment rates, we considered the following multi-class classification algorithms for constructing the predictive models: multinomial logistic regression, k-nearest neighbors (KNN), multinomial elastic net, support vector machine (SVM), and random forest

  • We considered the following feature sets: (1) population: the population from which participants can be recruited; (2) study center: population, facility count, and institution score; (3) study design: information related to the design of the clinical trial; (4) Medical Subject Headings (MeSH): characteristics regarding the medical domain of the trial; (5) complete: items (1)-(4)
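The feature sets listed above can be thought of as named groups of columns, with the complete set being their union. The sketch below illustrates that grouping under stated assumptions: every column name is a hypothetical placeholder (the source does not list the actual predictors), and only the set names and the composition of the study center set follow the highlight.

```python
from typing import Dict, List

# Hypothetical column names, for illustration only; the actual study
# used 4,636 candidate predictors that are not enumerated here.
FEATURE_SETS: Dict[str, List[str]] = {
    "population": ["population"],
    "study_center": ["population", "facility_count", "institution_score"],
    "study_design": ["phase", "allocation", "masking"],
    "mesh": ["mesh_neoplasms", "mesh_cardiovascular_diseases"],
}


def complete_feature_set(feature_sets: Dict[str, List[str]]) -> List[str]:
    """Return the union of all feature sets, keeping first-seen order."""
    seen: Dict[str, None] = {}
    for columns in feature_sets.values():
        for column in columns:
            seen.setdefault(column, None)  # dicts preserve insertion order
    return list(seen)


def select_features(row: Dict[str, float], columns: List[str]) -> List[float]:
    """Extract the values of the chosen columns from one trial record."""
    return [row[column] for column in columns]


if __name__ == "__main__":
    complete = complete_feature_set(FEATURE_SETS)
    print(len(complete), "columns in the complete set:", complete)
```

Keeping the groups in one mapping makes it straightforward to train one model per feature set and compare it against the model trained on the complete set.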


Summary

Introduction

Clinical trials represent a critical milestone of translational and clinical science, with the most direct potential impact on healthcare-related outcomes. Patient recruitment is a necessary condition of success for clinical trials. In specific situations, such as the broad impact of COVID-19, rapid and high-volume enrollment for vaccine trials is pivotal to global public health. Insufficient enrollment incurs significant costs and wasted resources for trial sponsors, the scientists conducting the trials, and society at large.

