Abstract

Prediction of cardiovascular disease (CVD) is important in clinical practice. Machine learning (ML) may offer an improved alternative to current CVD risk stratification in individual patients. We aim to identify important predictors and compare ML models with traditional models according to their prediction performance in a large long-term follow-up cohort. The Atherosclerosis Risk in Communities (ARIC) study was designed to study the progression of subclinical disease to cardiovascular events over a 25-year follow-up period. All phenotypic variables at visit 1 were obtained. All-cause death, CVD, and coronary heart disease were the outcomes for analysis. The ML framework involved variable selection using the random survival forest (RSF) method, model building, and 5-fold cross-validation. Model performance was evaluated by discrimination using the Harrell concordance index (C-index), accuracy using the Brier score (BS), and interpretability using the number of variables in the model. Of the 14,842 participants in ARIC, the average age was 54.2 years, with 45.2% male and 26.2% Black participants. Thirty-eight unique variables were selected in the RSF top 20 importance ranking of all 6 outcomes. Aging, hypertension, glucose metabolism, renal function, coagulation, adiposity, and sodium retention dominated the predictions of all outcomes. The ML models outperformed the regression models and established risk scores with a higher C-index, lower BS, and varied interpretability. The ML framework is useful for identifying important predictors of CVD and for developing models with robust performance compared with existing risk models.
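
As an illustration of the discrimination metric named above, the following is a minimal sketch of the Harrell C-index for right-censored data. It is not the study's code: the function name `harrell_c_index`, its argument names, and the toy inputs are hypothetical, and pairs with tied follow-up times are skipped for simplicity (other implementations treat such ties differently).

```python
from itertools import combinations


def harrell_c_index(times, events, risk_scores):
    """Fraction of comparable pairs in which the subject with the
    shorter observed time also has the higher predicted risk.

    times       -- observed follow-up times
    events      -- 1/True if the event occurred, 0/False if censored
    risk_scores -- model output, higher meaning higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # simplifying assumption: ignore tied times
        # Order the pair so that `first` has the shorter observed time.
        first, second = (i, j) if times[i] < times[j] else (j, i)
        if not events[first]:
            continue  # shorter time is censored, so the pair is not comparable
        comparable += 1
        if risk_scores[first] > risk_scores[second]:
            concordant += 1.0
        elif risk_scores[first] == risk_scores[second]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (hypothetical, not from ARIC): four subjects, one censored.
toy_times = [5, 10, 15, 20]
toy_events = [1, 0, 1, 1]
toy_risks = [0.9, 0.4, 0.6, 0.2]
print(harrell_c_index(toy_times, toy_events, toy_risks))
```

A value of 0.5 corresponds to uninformative predictions and 1.0 to perfect ranking of the comparable pairs. The Brier score used for accuracy would additionally need censoring-adjusted weights, which this sketch does not attempt.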
