Abstract

Statistical models for clinical risk prediction are often derived using data from primary care databases; however, they are frequently used outside of clinical settings. The use of prediction models in epidemiological studies without external validation may lead to inaccurate results. We use the example of applying the QRISK3 model to data from the United Kingdom (UK) Biobank study to illustrate the challenges and provide suggestions for future authors. The QRISK3 model is recommended by the National Institute for Health and Care Excellence (NICE) as a tool to aid cardiovascular risk prediction in English and Welsh primary care patients aged between 40 and 74. QRISK3 has not been externally validated for use in studies where data are collected for more general scientific purposes, including the UK Biobank study. This lack of external validation matters because the QRISK3 scores of UK Biobank participants have been used and reported in several publications. This paper outlines: (i) how various publications have used QRISK3 on UK Biobank data and (ii) the ways that the lack of external validation may affect the conclusions of these publications. We then propose potential solutions to these challenges, such as model recalibration and considering alternative models, for applying traditional statistical models such as QRISK3 in cohorts without external validation.

Highlights

  • In clinical practice, risk prediction models, alongside a clinician’s judgement, can be used to decide the appropriate treatment program for a patient

  • Without considering the discrimination and calibration of the QRISK3 model applied to United Kingdom (UK) Biobank data, the accuracy of the cardiovascular disease (CVD) risk scores is unknown, and conclusions from this application may be misleading

  • Sun et al. (2021) developed a polygenic risk score using genomic and traditional risk factor data from UK Biobank; they tested this model on 2.1 million individuals from the Clinical Practice Research Datalink (CPRD) and found that, if used at scale, adding these polygenic risk scores to traditional CVD risk factors could help prevent 7% more CVD events than using traditional risk factors alone


Summary

Introduction

Risk prediction models, alongside a clinician’s judgement, can be used to decide the appropriate treatment program for a patient. Prediction models are often derived using data from primary care settings (including general practice, community pharmacy, dental, and optometry services). These models are often used outside of primary care clinical settings in epidemiological cohort studies (for example, to stratify data by the baseline risk of the cohort, as a predictive feature, or as a comparator predictive model). We illustrate the challenges of using prediction models in populations different from those on which they were derived, using the example of applying the QRISK3 cardiovascular disease risk prediction model to UK Biobank participants. We recommend solutions for addressing the potentially inaccurate risk predictions.
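To make the ideas of discrimination, calibration, and recalibration concrete, the following is a minimal sketch rather than the method used in this paper. It assumes a simplified setting with a binary outcome and no censoring (a real validation of a 10-year risk model such as QRISK3 would need time-to-event methods), and the risks and outcomes are made-up toy values, not UK Biobank data. The function names are hypothetical.

```python
import math

def c_statistic(risks, outcomes):
    """Discrimination: share of (event, non-event) pairs in which the
    event has the higher predicted risk; ties count as half."""
    events = [r for r, y in zip(risks, outcomes) if y == 1]
    non_events = [r for r, y in zip(risks, outcomes) if y == 0]
    concordant = 0.0
    for e in events:
        for n in non_events:
            if e > n:
                concordant += 1.0
            elif e == n:
                concordant += 0.5
    return concordant / (len(events) * len(non_events))

def observed_expected_ratio(risks, outcomes):
    """Calibration-in-the-large: observed events over expected events.
    A value below 1 means the model over-predicts on average."""
    return sum(outcomes) / sum(risks)

def logit(p):
    return math.log(p / (1.0 - p))

def expit(x):
    return 1.0 / (1.0 + math.exp(-x))

def recalibrate_intercept(risks, outcomes, tol=1e-10):
    """Intercept-only recalibration: find the shift on the logit scale
    that makes the mean predicted risk equal the observed event rate.
    Solved by bisection, since the mean risk increases with the shift."""
    target = sum(outcomes) / len(outcomes)
    lo, hi = -10.0, 10.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        mean_risk = sum(expit(logit(r) + mid) for r in risks) / len(risks)
        if mean_risk < target:
            lo = mid
        else:
            hi = mid
    shift = (lo + hi) / 2.0
    return shift, [expit(logit(r) + shift) for r in risks]

# Toy data (assumed for illustration only).
risks = [0.05, 0.10, 0.15, 0.20, 0.30, 0.40, 0.50, 0.60]
outcomes = [0, 0, 0, 0, 1, 0, 0, 1]

c_before = c_statistic(risks, outcomes)
oe_before = observed_expected_ratio(risks, outcomes)
shift, recalibrated = recalibrate_intercept(risks, outcomes)
c_after = c_statistic(recalibrated, outcomes)
oe_after = observed_expected_ratio(recalibrated, outcomes)

print("C-statistic before/after:", c_before, c_after)
print("O/E ratio before/after:", oe_before, oe_after)
print("Logit-scale shift:", shift)
```

Shifting the intercept changes every risk by the same amount on the logit scale, so the ranking of individuals, and therefore the C-statistic, is unchanged; only calibration improves. Poor discrimination in a new cohort cannot be fixed this way and would instead call for model updating or an alternative model.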

The QRISK3 Model
UK Biobank
Model Validation
Literature Review
Commentary on the Literature Review Findings
Recommendations for Future Research
Model Calibration
Model Discrimination
Alternative Models
Collecting the Most Appropriate Risk Factors
Conclusion
Findings
A Appendix