Abstract

Background: The ACC/AHA Pooled Cohort Equations (PCE) Risk Calculator is widely used in the US for primary prevention of atherosclerotic cardiovascular disease (ASCVD), but it may under- or over-estimate risk in some populations. We therefore designed an automated, population-specific ASCVD risk calculator using machine-learning (ML) methods and electronic medical record (EMR) data, and compared its predictive power with that of the PCE calculator.

Methods and Findings: We collected data from 101,110 unique EMRs of living patients from January 1, 2009 to April 30, 2020. ML techniques were applied to patient datasets that included either only cross-sectional (CS) features, or CS features combined with longitudinal (LT) features derived from vital statistics and laboratory values. We compared the utility of the models using a proposed new cost measure (Screened Cases Percentage @ Sensitivity level).

All ML models tested achieved better predictive power than the PCE risk calculator. The random forest (RF) technique applied to the combination of CS and LT features (RF-LTC) produced the best area under the curve (AUC) score, 0.902 (95% confidence interval (CI), 0.895–0.910). To detect 90% of all positive ASCVD cases, the best ML model required screening only 43% of patients, whereas the PCE risk calculator required screening 69% of patients.

Conclusions: Prediction models built using ML techniques improved ASCVD prediction and reduced the number of screenings required to predict ASCVD when compared with the PCE calculator alone. Combining LT and CS features in the ML models significantly improved ASCVD prediction compared with using CS features alone.
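The abstract names the Screened Cases Percentage @ Sensitivity level measure without giving its formula. One plausible reading, offered here only as a sketch, is: rank patients from highest to lowest predicted risk, then report the percentage of patients that must be screened, going down that ranking, before the requested share of true ASCVD cases has been captured. The function name `screened_cases_percentage`, its parameters, and the handling of tied scores are assumptions made for illustration, not the authors' implementation.

```python
import numpy as np


def screened_cases_percentage(y_true, risk_scores, sensitivity=0.90):
    """Percentage of patients to screen, in descending risk order,
    to capture at least `sensitivity` of all positive cases.

    Assumed interpretation of the measure named in the abstract;
    tied scores are broken by original order (stable sort).
    """
    y_true = np.asarray(y_true, dtype=bool)
    risk_scores = np.asarray(risk_scores, dtype=float)
    if y_true.shape != risk_scores.shape:
        raise ValueError("y_true and risk_scores must have the same shape")
    if not 0.0 < sensitivity <= 1.0:
        raise ValueError("sensitivity must be in (0, 1]")

    n_pos = int(y_true.sum())
    if n_pos == 0:
        raise ValueError("at least one positive case is required")

    # Sort patients from highest to lowest predicted risk.
    order = np.argsort(-risk_scores, kind="stable")
    # Running count of true positives found as screening proceeds.
    cum_pos = np.cumsum(y_true[order])

    # Number of positives that must be found; the small epsilon guards
    # against floating-point overshoot such as 0.7 * 10 > 7.
    needed = int(np.ceil(sensitivity * n_pos - 1e-9))
    # First rank at which the running count reaches `needed`.
    k = int(np.searchsorted(cum_pos, needed, side="left")) + 1
    return 100.0 * k / y_true.size


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    labels = [1, 0, 1, 0, 0, 1, 0, 0, 0, 0]
    scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
    print(screened_cases_percentage(labels, scores, sensitivity=0.90))
```

Under this reading, a lower value is better: at a sensitivity level of 90%, the abstract reports 43% for the best ML model against 69% for the PCE calculator.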
