Application of ensemble machine learning algorithms on lifestyle factors and wearables for cardiovascular risk prediction

Weiting Huang, Ong Eng Hock Marcus, Woon Loong Calvin Chin, Tan Wei Ying, Lohendran Baskaran, Khung Keong Yeo, Ng See Kiong

doi:10.1038/s41598-021-04649-y

Abstract

This study examined novel data sources for cardiovascular risk prediction, including a detailed lifestyle questionnaire and continuous blood pressure monitoring, using ensemble machine learning algorithms (MLAs). The conventional risk score used as the reference was the Framingham Risk Score (FRS). The outcome classes were low risk (coronary artery calcium score of 0) and high risk (calcium score of 100 and above). Ensemble MLAs were built from naive Bayes, random forest and support vector classifier for the low-risk category, and from generalized linear regression, support vector regressor and stochastic gradient descent regressor for the high-risk category. MLAs were trained on 600 Southeast Asians aged 21 to 69 years who were free of cardiovascular disease. All MLAs outperformed the FRS for both the low- and high-risk categories. The MLA based on the lifestyle questionnaire alone achieved an AUC of 0.715 (95% CI 0.681, 0.750) for low risk and 0.710 (95% CI 0.653, 0.766) for high risk. Combining all groups of risk factors (lifestyle survey questionnaires, clinical blood tests, 24-h ambulatory blood pressure and heart rate monitoring) with feature selection further improved the AUC to 0.791 (95% CI 0.759, 0.822) for the low-risk group and 0.790 (95% CI 0.745, 0.836) for the high-risk group. Besides conventional predictors, self-reported physical activity, average daily heart rate, awake blood pressure variability and percentage of time in diastolic hypertension were important contributors to CVD risk classification.
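As a rough illustration of the setup the abstract describes, the sketch below maps a calcium score to the two outcome classes and combines three base-learner outputs into one ensemble score. The abstract does not say how the base learners were combined, so the unweighted averaging (soft voting) here is an assumption, and the function names `cac_risk_class` and `soft_vote` and the example probabilities are hypothetical.

```python
from statistics import mean


def cac_risk_class(cac_score):
    """Map a coronary artery calcium (CAC) score to the abstract's outcome classes."""
    if cac_score < 0:
        raise ValueError("CAC score cannot be negative")
    if cac_score == 0:
        return "low"
    if cac_score >= 100:
        return "high"
    # Scores strictly between 0 and 100 fall in neither class named in the abstract.
    return None


def soft_vote(base_probabilities, threshold=0.5):
    """Average base-learner probabilities and threshold the result.

    ASSUMPTION: the abstract does not state the combination rule; unweighted
    averaging (soft voting) is used here purely for illustration.
    """
    if not base_probabilities:
        raise ValueError("at least one base-learner probability is required")
    averaged = mean(base_probabilities)
    return averaged, averaged >= threshold


# Hypothetical low-risk probabilities for one participant from the three
# base classifiers named in the abstract (values are made up).
example_probs = {"naive_bayes": 0.62, "random_forest": 0.48, "svc": 0.71}
score, is_low_risk = soft_vote(list(example_probs.values()))
print(f"ensemble probability of low risk: {score:.3f}, predicted low risk: {is_low_risk}")
```

In this made-up example the random forest alone would not predict low risk at a 0.5 threshold, but the averaged score does, which is the kind of smoothing an ensemble is meant to provide.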

Highlights

This study examined novel data sources for cardiovascular risk prediction, including a detailed lifestyle questionnaire and continuous blood pressure monitoring, using ensemble machine learning algorithms (MLAs).
The aim of this paper is to investigate the additive value, in cardiovascular risk prediction with ensemble MLAs, of four groups of risk factors based on ease of information availability and regular clinical workflow.
Since no single metric can objectively evaluate cardiovascular risk prediction, we evaluate the performance of our models at the CVD risk class level using a panel of metrics: sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), F1-score and area under the receiver operating characteristic curve (AUC).
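The panel of metrics listed above can be sketched in a few lines of plain Python. This is a minimal illustration using standard textbook definitions, not the authors' evaluation code; the function names, the 0.5 decision threshold and the toy labels and scores are all assumptions made for the example.

```python
def classification_panel(y_true, y_pred):
    """Compute sensitivity, specificity, PPV, NPV and F1 from binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    def ratio(num, den):
        # Return NaN instead of raising when a denominator is empty.
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
        "f1": ratio(2 * tp, 2 * tp + fp + fn),
    }


def auc(y_true, scores):
    """AUC as the probability that a positive case outscores a negative one (ties count half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (made up): true classes and model scores for eight participants.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold of 0.5

panel = classification_panel(y_true, y_pred)
panel["auc"] = auc(y_true, scores)
for name, value in panel.items():
    print(f"{name}: {value:.4f}")
```

The threshold-dependent metrics change with the chosen cut-off, whereas the AUC summarizes ranking quality across all thresholds, which is one reason to report them together.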

Summary

Introduction

This study examined novel data sources for cardiovascular risk prediction, including a detailed lifestyle questionnaire and continuous blood pressure monitoring, using ensemble machine learning algorithms (MLAs). The INTERHEART study found that nine risk factors (smoking, history of hypertension or diabetes, waist/hip ratio, dietary patterns, physical activity, consumption of alcohol, blood apolipoproteins (Apo), and psychosocial factors) accounted for 90% of the population attributable risk for myocardial infarction in men and 94% in women. Because of the diverse data sources and data types, including time series, an integrated assessment combining lifestyle, diet, ambulatory physiological parameters, and clinical risk markers has not, to our knowledge, been performed.

Abbreviations: ALT, alanine aminotransferase; AST, aspartate transaminase; AUC, area under receiver operating characteristics; ARV, average real variability; BP, blood pressure; CAC, coronary artery calcium; CVD, cardiovascular disease; FPR, false positive rates; FRS, Framingham risk score; HDL, high-density lipoprotein cholesterol; LDL, low-density lipoprotein cholesterol; MLA, machine learning algorithms; NHCS, National Heart Centre Singapore; NPV, negative predictive values; PPV, positive predictive values; ROC, receiver operating characteristics; SMOTE, synthetic minority oversampling technique; TPR, true positive rates.

Journal: Scientific Reports	Publication Date: Jan 20, 2022
Citations: 25	License type: open-access
