Abstract

Background: The design used to create labelled data for training prediction models from observational healthcare databases (e.g., case-control and cohort) may impact the models' clinical usefulness. We aim to investigate hypothetical design issues and determine how the design impacts prediction model performance.

Aim: To empirically investigate differences between models developed using a case-control design and a cohort design.

Methods: Using a US claims database, we replicated two published prediction models (dementia and type 2 diabetes) that were developed using a case-control design, and trained models for the same prediction questions using cohort designs. We validated each model on data mimicking the point in time at which the models would be applied in clinical practice. We calculated each model's discrimination and calibration-in-the-large performance.

Results: The dementia models obtained areas under the receiver operating characteristic curve (AUROCs) of 0.560 and 0.897 for the case-control and cohort designs, respectively. The type 2 diabetes models obtained AUROCs of 0.733 and 0.727 for the case-control and cohort designs, respectively. The dementia and diabetes case-control models were both poorly calibrated, whereas the dementia cohort model achieved good calibration. We show that careful construction of a case-control design can lead to discriminative performance comparable to a cohort design, but case-control designs over-represent the outcome class, leading to miscalibration.

Conclusions: Any case-control design can be converted to a cohort design. We recommend that researchers with observational data use the less subjective and generally better calibrated cohort design when extracting labelled data. However, if a carefully constructed case-control design is used, then the model must be prospectively validated using a cohort design for fair evaluation, and it must be recalibrated.
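The results above hinge on the distinction between discrimination (AUROC) and calibration-in-the-large (mean predicted risk versus observed risk). Below is a minimal sketch, not the authors' code, using simulated data with invented coefficients: it shows how a 1:1 case-control sample can match a cohort design's AUROC while grossly over-predicting risk, and how one standard fix, a log-odds intercept offset (King & Zeng's prior correction), restores calibration. The study's own recalibration method may differ.

```python
# A minimal sketch (not the authors' code): simulated data illustrating why
# 1:1 case-control sampling preserves discrimination (AUROC) but breaks
# calibration-in-the-large, and how an intercept offset can recalibrate.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)

# Simulate a cohort with a rare outcome (~5% prevalence); the coefficients
# are arbitrary and purely illustrative.
n = 200_000
X = rng.normal(size=(n, 3))
true_logit = -3.5 + X @ np.array([1.0, 0.5, -0.5])
y = rng.random(n) < 1.0 / (1.0 + np.exp(-true_logit))

# Hold out half the cohort to mimic validation at the point of application.
X_tr, X_te = X[: n // 2], X[n // 2:]
y_tr, y_te = y[: n // 2], y[n // 2:]

# (a) Cohort design: train on the full training cohort.
cohort_model = LogisticRegression().fit(X_tr, y_tr)

# (b) Case-control design: all cases plus an equal number of sampled controls.
cases = np.flatnonzero(y_tr)
controls = rng.choice(np.flatnonzero(~y_tr), size=cases.size, replace=False)
cc = np.concatenate([cases, controls])
cc_model = LogisticRegression().fit(X_tr[cc], y_tr[cc])

def report(name, p):
    # Calibration-in-the-large: mean predicted risk vs. observed risk.
    print(f"{name}: AUROC={roc_auc_score(y_te, p):.3f}  "
          f"mean predicted={p.mean():.3f}  observed={y_te.mean():.3f}")

report("cohort      ", cohort_model.predict_proba(X_te)[:, 1])
report("case-control", cc_model.predict_proba(X_te)[:, 1])

# Recalibrate the case-control model with a log-odds intercept offset
# (King & Zeng's "prior correction"); for a 1:1 sample the offset is
# log((1 - prevalence) / prevalence), using the cohort prevalence.
prev = y_tr.mean()
offset = np.log((1 - prev) / prev)
lp = cc_model.decision_function(X_te) - offset
report("recalibrated", 1.0 / (1.0 + np.exp(-lp)))
```

On this simulation both designs discriminate similarly, but the case-control model's mean predicted risk far exceeds the observed prevalence until the offset is applied, mirroring the miscalibration pattern reported in the abstract.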

Highlights

  • The design used to create labelled data for training prediction models from observational healthcare databases may impact the models' clinical usefulness

  • We argue that using a cohort design to extract labelled data for developing prediction models is preferable, as it overcomes bias and clinical application issues that can plague the case-control design

  • We replicated two published prediction models developed using a case-control design and showed that these models could have been developed with a cohort design


Introduction

The design used to create labelled data for training prediction models from observational healthcare databases (e.g., case-control and cohort) may impact the models' clinical usefulness. A recent review of prognostic models for cardiovascular outcomes showed that the number of models being published is increasing over time, but most published models have issues (e.g., methodological details missing from the publication, lack of external validation, and standard performance measures not used) [2]. This problem is observed across outcomes: many models fail to adhere to best practices for model development and reporting [2,3,4]. There may be even bigger problems with some prognostic models developed on observational data, due to the process used to create labelled data for the machine learning algorithms.
