Abstract

Background: Coronary artery calcium score (CACS) is a reliable predictor of future cardiovascular disease risk. Although deep learning studies using computed tomography (CT) images to predict CACS have been reported, no study has assessed the feasibility of machine learning (ML) algorithms for predicting CACS from clinical variables in a healthy general population. Therefore, we aimed to assess whether ML algorithms other than binary logistic regression (BLR) could predict high CACS in a healthy population using general health examination data.
Methods: This retrospective observational study included participants who underwent regular health screening that included coronary CT angiography. High CACS was defined as an Agatston score ≥ 100. Univariable and multivariable BLR were performed to assess predictors of high CACS in the entire dataset. For ML prediction of high CACS, the dataset was randomly divided into training and test datasets at a 7:3 ratio. BLR, catboost, and xgboost algorithms were tuned with 5-fold cross-validation and grid search to find the best-performing classifier. The performance of each ML algorithm was compared using the area under the receiver operating characteristic (AUROC) curve.
Results: A total of 2133 participants were included in the final analysis. The mean age was 55.4 ± 11.3 years, and 1483 participants (69.5%) were male. In multivariable BLR analysis, age (odds ratio [OR], 1.12; 95% confidence interval [CI], 1.10–1.15, p < 0.001), male sex (OR, 2.91; 95% CI, 1.57–5.38, p < 0.001), systolic blood pressure (OR, 1.02; 95% CI, 1.00–1.03, p = 0.019), and low-density lipoprotein cholesterol (OR, 1.00; 95% CI, 0.99–1.00, p = 0.047) were significant predictors of high CACS. Xgboost achieved the highest AUROC for predicting high CACS (0.823), followed by catboost (0.750) and BLR (0.585). The difference in AUROC between xgboost and BLR was significant (p < 0.001).
Conclusions: The xgboost algorithm predicted high CACS in healthy participants more reliably than BLR. ML algorithms may be useful for predicting CACS with only laboratory data in healthy participants.
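
The data-handling steps named in the Methods (the Agatston cutoff, the random 7:3 split, and 5-fold cross-validation) can be sketched in plain Python. This is an illustrative sketch only, not the authors' code: the function names, the seed, and the use of the standard library instead of an ML toolkit are all assumptions, and no model is fitted here.

```python
import random

HIGH_CACS_THRESHOLD = 100  # Agatston score cutoff stated in the abstract


def label_high_cacs(agatston_scores):
    """Return 1 for an Agatston score >= 100, else 0."""
    return [1 if s >= HIGH_CACS_THRESHOLD else 0 for s in agatston_scores]


def train_test_split_indices(n, train_fraction=0.7, seed=0):
    """Randomly partition range(n) into training and test index lists."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    cut = int(round(n * train_fraction))
    return indices[:cut], indices[cut:]


def k_fold_indices(train_indices, k=5):
    """Yield (fit, validation) index lists for k-fold cross-validation."""
    folds = [train_indices[i::k] for i in range(k)]
    for i in range(k):
        validation = folds[i]
        fit = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield fit, validation


# Cohort size taken from the abstract; the seed is arbitrary (assumption).
train_idx, test_idx = train_test_split_indices(2133, train_fraction=0.7, seed=42)
folds = list(k_fold_indices(train_idx, k=5))
```

In a grid search, each candidate hyperparameter setting would be fitted on the `fit` indices and scored on the `validation` indices of every fold, with the test indices held back until a final classifier is chosen.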

Highlights

  • Cardiovascular disease (CVD) is one of the leading causes of death worldwide [1]

  • Predictors in binary logistic regression (BLR) analysis were represented with odds ratio (OR) and 95% confidence interval (CI)

  • The performance of each machine learning (ML) algorithm was measured by the area under the receiver operating characteristic (AUROC) curve
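
As a reminder of what the AUROC measures, it equals the probability that a randomly chosen high-CACS case receives a higher predicted score than a randomly chosen low-CACS case. A minimal sketch of that definition follows, using hypothetical toy data rather than study data; the function name `auroc` is an assumption, and the pairwise loop is written for clarity rather than speed.

```python
def auroc(labels, scores):
    """AUROC as the probability that a random positive outranks a random negative.

    A tie between a positive and a negative counts as half a win
    (the Mann-Whitney U formulation).
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUROC needs at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy labels and predicted probabilities (hypothetical, not from the study).
y_true = [0, 0, 1, 1, 0, 1]
y_score = [0.10, 0.40, 0.35, 0.80, 0.20, 0.70]
toy_auroc = auroc(y_true, y_score)
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which is the scale on which the reported 0.823, 0.750, and 0.585 should be read.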

Summary

Introduction

Cardiovascular disease (CVD) is one of the leading causes of death worldwide [1]. Inflammation of vascular smooth muscle cells results in increased calcium deposits that develop into atherosclerotic plaque on the inner wall of the coronary artery. Coronary artery calcium score (CACS) on computed tomography (CT) is an important predictor of future CVD development and mortality in the general population [3,4,5]. We aimed to assess whether ML algorithms other than binary logistic regression (BLR) could predict high CACS in a healthy population using general health examination data. In multivariable BLR analysis, age (odds ratio [OR], 1.12; 95% confidence interval [CI], 1.10–1.15, p < 0.001), male sex (OR, 2.91; 95% CI, 1.57–5.38, p < 0.001), systolic blood pressure (OR, 1.02; 95% CI, 1.00–1.03, p = 0.019), and low-density lipoprotein cholesterol (OR, 1.00; 95% CI, 0.99–1.00, p = 0.047) were significant predictors of high CACS. ML algorithms may be useful for predicting CACS with only laboratory data in healthy participants.
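
An odds ratio from logistic regression is the exponential of the model coefficient, so it applies per one-unit change in the predictor and compounds multiplicatively over larger changes. The sketch below rescales the reported age OR to a 10-unit change. It assumes that age entered the model in years as a linear term on the log-odds scale, which the abstract does not state explicitly; the function names are hypothetical, and because the inputs are rounded published values the results are approximate.

```python
import math


def odds_ratio_to_coefficient(odds_ratio):
    """Logistic regression coefficient (log-odds per unit) implied by an OR."""
    return math.log(odds_ratio)


def rescale_odds_ratio(odds_ratio, units):
    """OR for a change of `units` in the predictor, assuming a linear log-odds effect."""
    return math.exp(units * odds_ratio_to_coefficient(odds_ratio))


# Reported OR for age and its 95% CI bounds, taken from the text above.
age_or, age_lo, age_hi = 1.12, 1.10, 1.15

# OR and CI bounds for a 10-unit (assumed: 10-year) difference in age.
per_decade = [rescale_odds_ratio(v, 10) for v in (age_or, age_lo, age_hi)]
```

Under these assumptions, the small-looking per-unit OR of 1.12 corresponds to roughly a threefold increase in the odds of high CACS per decade of age.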
