Abstract

In today's information age, the necessary means exist for clinical risk prediction to capitalize on a multitude of data sources, increasing the potential for greater accuracy and improved patient care. Towards this objective, the Prostate Cancer DREAM Challenge posted comprehensive information from three clinical trials recording survival for patients with metastatic castration-resistant prostate cancer treated with first-line docetaxel. A subset of an independent clinical trial was used for interim evaluation of model submissions, providing critical feedback to participating teams for tailoring their models to the desired target. Final submitted models were evaluated and ranked on the independent clinical trial. Our team, called "A Bavarian Dream", utilized many of the common statistical methods for data dimension reduction and summarization during the challenge. Three general modeling principles emerged that we deemed helpful for building accurate risk prediction tools and for finishing among the winning teams of both sub-challenges. These principles were: first, good data, encompassing the collection of important variables and imputation of missing data; second, wisdom of the crowd, extending beyond the usual model ensemble notion to the inclusion of experts on specific risk ranges; and third, recalibration, entailing transfer learning to the target source. In this study, we illustrate the application and impact of these principles on data from the Prostate Cancer DREAM Challenge.

Highlights

  • Government-funded clinical and research trials are under increasing pressure to publish comprehensive anonymized data in order to maximize scientific output, ushering in new challenges and opportunities for data scientists[1].

  • First concept, good data: Figure 1 gives an overview of the Prostate Cancer Dialogue for Reverse Engineering Assessments and Methods (DREAM) Challenge data after some cleaning but before inclusion of additional variables.

  • The effect of applying each of the principles on the integrated area under the curve (iAUC) and the root mean squared error (RMSE) in the Prostate Cancer DREAM Challenge is quantified in Figure 7 and Figure 8, respectively.


Summary

Introduction

Government-funded clinical and research trials are under increasing pressure to publish comprehensive anonymized data in order to maximize scientific output, ushering in new challenges and opportunities for data scientists[1]. In an era of personalized medicine, scientists analyzing the results of large population-based clinical and prevention trials are further encouraged to translate results to clinical practice. With the patient as the consumer, this push has led to an explosion of easy-to-use online clinical risk prediction tools for most types of clinical outcomes[2,3]. Single-study prediction models have so far dominated out of convenience. Now that multiple studies are available, they can be combined, increasing accuracy through the wisdom-of-the-crowd philosophy and providing more realistic estimates of variability for decision-making. Ensembles, or collections of models, have been shown to outperform top-nominated models[4].
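The wisdom-of-the-crowd idea can be illustrated with a minimal sketch. It is not the team's actual model. It only shows, on made-up numbers, how averaging the predictions of several models can lower the RMSE relative to each individual model. The function names `rmse` and `ensemble_mean` and all data values are hypothetical and chosen for illustration.

```python
import math


def rmse(predicted, observed):
    """Root mean squared error between two equal-length sequences."""
    if len(predicted) != len(observed):
        raise ValueError("sequences must have the same length")
    squared_errors = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def ensemble_mean(model_predictions):
    """Average the predictions of several models, patient by patient."""
    return [sum(values) / len(values) for values in zip(*model_predictions)]


# Hypothetical toy data (not from the challenge): observed outcomes for
# four patients and the predictions of three separate models.
observed = [10.0, 20.0, 30.0, 40.0]
model_predictions = [
    [12.0, 18.0, 33.0, 37.0],
    [8.0, 23.0, 28.0, 42.0],
    [11.0, 19.0, 29.0, 44.0],
]

ensemble = ensemble_mean(model_predictions)

for index, predictions in enumerate(model_predictions, start=1):
    print(f"model {index} RMSE: {rmse(predictions, observed):.2f}")
print(f"ensemble RMSE: {rmse(ensemble, observed):.2f}")
```

In this toy case the errors of the three models partly cancel, so the averaged prediction is closer to the observed values than any single model. In general, a simple average is only guaranteed to do no worse than the average of the individual RMSEs. It need not beat the best single model.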
