Introduction: Optimal statin treatment decisions for primary prevention of atherosclerotic cardiovascular disease (ASCVD) rely on shared decision-making between patient and provider. We sought to develop a machine learning-based algorithm to personalize cholesterol treatment decisions using electronic medical record (EMR) data. Methods: We included EMR data for adults aged 40 to 79 years with no prior ASCVD or statin therapy who were seen in a Northern California outpatient system between January 1, 2009 and December 31, 2018, and who had at least two visits at least 1 year apart and at least two low-density lipoprotein cholesterol (LDL-C) values. The outcome was the LDL-C value measured closest to one year after a patient’s second visit. We modeled four treatment decisions: no statin use, low-intensity statin use, moderate-intensity statin use, and high-intensity statin use. We trained weighted K-nearest-neighbor (wKNN) regression models to identify, for each line of therapy, patients similar to a candidate patient. The algorithm compared the outcomes of these similar patients and recommended the treatment with the lowest predicted LDL-C at one year. Results: Our study cohort consisted of 50,911 patients (age 54.6 ± 9.84 years, baseline LDL-C 122 ± 34.2 mg/dL, follow-up LDL-C 121 ± 35.9 mg/dL), of whom 54% were female, 47% Non-Hispanic White, 32% Asian, and 7.5% Hispanic. Among 8,551 test patients visiting in 2015 or later, 96.9%, 3.08%, and 0.05% were recommended to begin high-intensity, moderate-intensity, and low-intensity statins, respectively. With these recommendations, LDL-C values at 1-year follow-up were predicted to be, on average, 21.5 ± 43.5 mg/dL (17.6%) lower per patient (Figure). Conclusions: EMR-trained wKNN models can predict patient LDL-C trajectories under different lines of statin therapy. Machine learning models leveraging real-world datasets may provide useful statin therapy treatment recommendations for primary ASCVD prevention.
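The recommendation step described in the Methods can be illustrated with a minimal sketch. The abstract does not state the feature set, distance metric, weighting scheme, or value of K, so everything below is an assumption made for illustration: two features (age and baseline LDL-C), Euclidean distance, inverse-distance weights, K = 3, and the hypothetical names `wknn_predict`, `recommend`, and `cohorts`. The numbers are synthetic toy values, not study data, and the features are left unscaled for brevity.

```python
import math

def wknn_predict(features, outcomes, query, k=3, eps=1e-9):
    """Inverse-distance-weighted mean outcome of the k nearest neighbors.

    Assumption: Euclidean distance and 1/d weights; the abstract does not
    specify either choice.
    """
    nearest = sorted(
        (math.dist(x, query), y) for x, y in zip(features, outcomes)
    )[:k]
    weights = [1.0 / (d + eps) for d, _ in nearest]
    weighted_sum = sum(w * y for w, (_, y) in zip(weights, nearest))
    return weighted_sum / sum(weights)

def recommend(cohorts, query, k=3):
    """Predict follow-up LDL-C under each treatment arm and pick the lowest."""
    preds = {
        arm: wknn_predict(X, y, query, k) for arm, (X, y) in cohorts.items()
    }
    best = min(preds, key=preds.get)
    return best, preds

# Synthetic toy cohorts: (age, baseline LDL-C) -> follow-up LDL-C in mg/dL.
X = [(50, 120), (60, 140), (45, 100)]
cohorts = {
    "none": (X, [121, 139, 101]),
    "low": (X, [100, 118, 85]),
    "moderate": (X, [85, 98, 70]),
    "high": (X, [65, 75, 55]),
}

best, preds = recommend(cohorts, query=(52, 125))
print(best, {arm: round(p, 1) for arm, p in preds.items()})
```

In this toy setup each arm is queried separately, so a candidate patient is only ever compared with similar patients who actually received that therapy. A real pipeline would also need feature scaling and a held-out test split, such as the 2015-or-later test patients described in the Results.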