Abstract

Missing data need to be addressed at each stage of developing, validating and implementing a clinical prediction model (CPM). However, no clear guidance exists on handling missing data across this pipeline, and it is unknown which methods are used in practice. We aimed to review and summarize the approaches to handling missing data that underlie the CPMs currently recommended for use in UK healthcare. We identified eligible CPMs through discussions with the National Institute for Health and Care Excellence (NICE), a call on Twitter, and contact with other research groups. For each model, we identified the development paper and the ten most cited external validation papers. We extracted information on the methods used for handling missing data, as well as reported strengths and limitations, and any stated assumptions. Twenty-three CPMs met the eligibility criteria. Three CPMs handled missing data consistently across their pipelines. Six missing data strategies were identified. How missing data were handled was not reported in 52% of the development articles and 48% of the validation articles. Complete case analysis (CCA) was the most common approach for both development (40%) and validation (44%). At implementation, 57% of the CPMs required complete data entry, whilst 43% allowed missing values (35% assumed the risk factor was absent; 4% used an additional category for missingness; 4% imputed mean values). A broad variety of methods for handling missing data underlie the CPMs currently recommended for use in UK healthcare. Missing data handling strategies were inconsistent across the CPM pipeline of development, external validation, and implementation. Better quality assurance of CPMs requires greater clarity and consistency in the handling of missing data.
