Abstract
Introduction: Patients with atherosclerotic cardiovascular disease (ASCVD) remain at high risk for recurrent ASCVD events despite statin use. The pooled cohort equations (PCE) are used for ASCVD risk prediction in primary prevention, but there are no validated models for predicting recurrent risk in secondary prevention. Machine learning (ML) shows promise for developing novel risk prediction models from electronic health record (EHR) data. Methods: We included adults with prior ASCVD, using EHR data from an outpatient health system in Northern California between January 1, 2009 and December 31, 2018, who had at least 2 visits at least 1 year apart and 5 years of follow-up. The outcome was a recurrent ASCVD event, defined as the first myocardial infarction, stroke, or fatal coronary artery disease in the 5-year follow-up period. We trained the following ML models to predict recurrent ASCVD risk: random forests (RF), gradient boosted machines (GBM), extreme gradient boosted models (XGBoost), and logistic regression with a standard L2 penalty (LR) and with an L1 penalty (Lasso). We evaluated the performance of the ML models and the PCE on a 20% held-out test cohort using the area under the receiver operating characteristic curve (AUC). Results: Our cohort consisted of 32,192 patients with ASCVD (mean age 70 years, 46% women, 12% Asian, and 6% Hispanic). Fewer than half (49%) were on guideline-directed statin therapy. XGBoost and GBM were the best-performing models for recurrent ASCVD risk prediction, while the PCE performed poorly (Figure). The top 20 predictive variables for recurrent ASCVD risk included prior events (ischemic stroke, myocardial infarction), traditional risk factors (age, blood pressure, lipid levels), and socioeconomic factors (income, education). Conclusions: EHR-trained ML models facilitated recurrent ASCVD risk prediction in real-world secondary prevention patients. ML models developed from large datasets may help bridge contemporary gaps in ASCVD risk prediction.
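The evaluation described in the Methods (a 20% held-out test cohort scored by AUC) can be illustrated with a minimal sketch. This is not the authors' pipeline, which the abstract does not describe in code. The function names `held_out_split` and `roc_auc`, the random seed, and the toy labels and scores are all assumptions made for illustration. The AUC is computed here with the standard rank-based (Mann-Whitney) formulation, in which tied scores receive average ranks.

```python
import random


def held_out_split(n_patients, test_fraction=0.2, seed=0):
    """Shuffle patient indices and reserve a fraction as a held-out test set.

    Hypothetical helper: the abstract states only that 20% was held out,
    not how the split was drawn.
    """
    indices = list(range(n_patients))
    random.Random(seed).shuffle(indices)
    n_test = int(round(n_patients * test_fraction))
    return indices[n_test:], indices[:n_test]  # (train, test)


def roc_auc(labels, scores):
    """AUC as the probability that a random event case outranks a random
    non-event case, with ties counted as one half (Mann-Whitney U / (P * N))."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one positive and one negative label")

    pairs = sorted(zip(scores, labels))
    rank_sum_pos = 0.0
    i = 0
    while i < len(pairs):
        # Find the run of tied scores starting at position i.
        j = i
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            j += 1
        # 1-based ranks i+1 .. j share their average rank.
        avg_rank = (i + 1 + j) / 2.0
        rank_sum_pos += avg_rank * sum(label for _, label in pairs[i:j])
        i = j

    u_statistic = rank_sum_pos - n_pos * (n_pos + 1) / 2.0
    return u_statistic / (n_pos * n_neg)


# Toy illustration only (synthetic values, not study data):
# 1 = recurrent ASCVD event within 5 years, 0 = no event.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]  # hypothetical predicted risks
toy_auc = roc_auc(toy_labels, toy_scores)

train_idx, test_idx = held_out_split(10)
```

In practice, each trained model (and the PCE) would produce one risk score per patient in the held-out cohort, and the same `roc_auc` call would be applied to each set of scores so the models can be compared on identical patients.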