Clinical Prediction Models in Epidemiological Studies: Lessons from the Application of QRISK3 to UK Biobank Data

Ruth E Parsons, David A Clifton, Lei Clifton, Glen Wright Colopy

doi:10.6339/22-jds1037

Abstract

Statistical models for clinical risk prediction are often derived using data from primary care databases; however, they are frequently used outside of clinical settings. The use of prediction models in epidemiological studies without external validation may lead to inaccurate results. We use the example of applying the QRISK3 model to data from the United Kingdom (UK) Biobank study to illustrate the challenges and provide suggestions for future authors. The QRISK3 model is recommended by the National Institute for Health and Care Excellence (NICE) as a tool to aid cardiovascular risk prediction in English and Welsh primary care patients aged between 40 and 74. QRISK3 has not been externally validated for use in studies where data are collected for more general scientific purposes, including the UK Biobank study. This lack of external validation is important because the QRISK3 scores of participants in UK Biobank have been used and reported in several publications. This paper outlines: (i) how various publications have used QRISK3 on UK Biobank data and (ii) the ways that the lack of external validation may affect the conclusions from these publications. We then propose potential solutions to these challenges, such as model recalibration and considering alternative models, for applying traditional statistical models such as QRISK3 in cohorts without external validation.

Highlights

In clinical practice, risk prediction models, alongside a clinician’s judgement, can be used to decide the appropriate treatment program for a patient.
Without considering the discrimination and calibration of the QRISK3 model applied to United Kingdom (UK) Biobank data, the accuracy of the cardiovascular disease (CVD) risk scores is unknown, and conclusions from this application may be misleading.
Sun et al. (2021) developed a polygenic risk score using genomic and traditional risk factor data from UK Biobank; they tested this model using 2.1 million individuals from the Clinical Practice Research Datalink (CPRD) and found that adding these polygenic risk scores to traditional CVD risk factors could help prevent 7% more CVD events than using traditional risk factors alone, if this technique were used at scale.
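The second highlight turns on two properties of a risk model: discrimination (does the model rank people who go on to have an event above those who do not?) and calibration (do predicted risks match observed event rates?). The sketch below shows one simple way each could be checked. It is an illustration under strong assumptions, not the authors' procedure: it treats the outcome as a binary indicator over a fixed horizon with no censoring, the function names are hypothetical, and the toy numbers are invented purely to exercise the code. For time-to-event data with censoring, a survival-aware concordance measure would be needed instead.

```python
from typing import Sequence


def c_statistic(risks: Sequence[float], events: Sequence[int]) -> float:
    """Discrimination: share of (event, non-event) pairs in which the person
    with the event had the higher predicted risk. Ties count as 0.5."""
    cases = [r for r, e in zip(risks, events) if e == 1]
    controls = [r for r, e in zip(risks, events) if e == 0]
    if not cases or not controls:
        raise ValueError("need at least one event and one non-event")
    concordant = 0.0
    for risk_case in cases:
        for risk_control in controls:
            if risk_case > risk_control:
                concordant += 1.0
            elif risk_case == risk_control:
                concordant += 0.5
    return concordant / (len(cases) * len(controls))


def observed_expected_ratio(risks: Sequence[float], events: Sequence[int]) -> float:
    """Calibration-in-the-large: observed events divided by the sum of
    predicted risks. Values above 1 suggest under-prediction on average,
    values below 1 suggest over-prediction."""
    expected = sum(risks)
    if expected <= 0:
        raise ValueError("sum of predicted risks must be positive")
    return sum(events) / expected


# Invented toy data (hypothetical predicted risks and outcomes).
toy_risks = [0.05, 0.10, 0.20, 0.40, 0.15, 0.30]
toy_events = [0, 0, 1, 1, 0, 0]

print("C-statistic:", c_statistic(toy_risks, toy_events))
print("O/E ratio:", round(observed_expected_ratio(toy_risks, toy_events), 3))
```

A model can rank participants well (high C-statistic) and still systematically over- or under-predict risk in a new cohort, which is why both quantities are worth reporting when a model is applied outside the population it was derived on.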

Summary

Introduction

Risk prediction models, alongside a clinician’s judgement, can be used to decide the appropriate treatment program for a patient. Prediction models are often derived using data from primary care settings (including general practice, community pharmacy, dental, and optometry services). These models are often used outside of primary care clinical settings in epidemiological cohort studies (for example, to stratify data by the baseline risk of the cohort, as a predictive feature, or as a comparator predictive model). We illustrate the challenges of using prediction models in populations different from those they were derived on, using the example of applying the QRISK3 cardiovascular disease risk prediction model to the UK Biobank participants. We recommend solutions for addressing the potentially inaccurate risk predictions.
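The abstract names model recalibration as one possible remedy when a model is applied to a new cohort. As a rough illustration of the idea, the sketch below implements logistic recalibration: the original predicted risks are kept, but an intercept a and slope b are re-estimated in the new cohort so that logit(recalibrated risk) = a + b * logit(original risk). This is one common recalibration approach and is not claimed to be the method the paper uses. It assumes a binary outcome with no censoring; the function names and toy data are hypothetical.

```python
import math
from typing import Sequence, Tuple


def logit(p: float) -> float:
    return math.log(p / (1.0 - p))


def expit(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def fit_recalibration(risks: Sequence[float], events: Sequence[int],
                      iterations: int = 25) -> Tuple[float, float]:
    """Estimate (a, b) in logit(P(event)) = a + b * logit(original risk)
    by Newton-Raphson on the logistic log-likelihood."""
    x = [logit(r) for r in risks]
    a, b = 0.0, 1.0  # start from "no recalibration needed"
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, events):
            p = expit(a + b * xi)
            w = p * (1.0 - p)
            g0 += yi - p            # gradient with respect to a
            g1 += (yi - p) * xi     # gradient with respect to b
            h00 += w                # entries of the information matrix
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        if abs(det) < 1e-12:
            break
        # Newton step: solve the 2x2 system H * delta = g.
        da = (h11 * g0 - h01 * g1) / det
        db = (h00 * g1 - h01 * g0) / det
        a += da
        b += db
        if max(abs(da), abs(db)) < 1e-10:
            break
    return a, b


def recalibrate(risk: float, a: float, b: float) -> float:
    """Map an original predicted risk to its recalibrated value."""
    return expit(a + b * logit(risk))


# Invented toy cohort (hypothetical predicted risks and observed outcomes).
toy_risks = [0.05, 0.10, 0.15, 0.20, 0.30, 0.40, 0.50, 0.60]
toy_events = [0, 0, 1, 0, 0, 1, 0, 1]

a_hat, b_hat = fit_recalibration(toy_risks, toy_events)
new_risks = [recalibrate(r, a_hat, b_hat) for r in toy_risks]
print("intercept:", round(a_hat, 3), "slope:", round(b_hat, 3))
print("sum of recalibrated risks:", round(sum(new_risks), 3))
```

Because the fit includes an intercept, the recalibrated risks sum to the observed number of events in the cohort used for fitting. Recalibration leaves the ranking of participants unchanged whenever the slope is positive, so it can improve calibration but not discrimination.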

The QRISK3 Model

UK Biobank

Model Validation

Literature Review

Commentary on the Literature Review Findings

Recommendations for Future Research

Model Calibration

Model Discrimination

Alternative Models

Collecting the Most Appropriate Risk Factors

Conclusion

Findings

A Appendix

Journal: Journal of Data Science	Publication Date: Jan 1, 2022
Citations: 1	License type: CC BY 4.0
