Abstract

Background: The use of cardiovascular disease (CVD) risk estimation scores in primary prevention has long been established. However, their performance remains a matter of concern. The aim of this study was to explore the potential of machine learning (ML) methodologies for CVD prediction, particularly in comparison with an established risk tool, the HellenicSCORE.

Methods: Data from the ATTICA prospective study (n = 2020 adults, enrolled during 2001–02 and followed up in 2011–12) were used. Three different machine learning classifiers (k-NN, random forest, and decision tree) were trained and evaluated against 10-year CVD incidence, in comparison with the HellenicSCORE tool (a calibration of the ESC SCORE). Training datasets comprising between 5 and 16 variables were chosen, with or without bootstrapping, in an attempt to achieve the best overall performance for the machine learning classifiers.

Results: Performance varied with the classifier and the training dataset but was comparable between the two methodological approaches. In particular, the HellenicSCORE showed accuracy 85%, specificity 20%, sensitivity 97%, positive predictive value 87%, and negative predictive value 58%, whereas for the machine learning methodologies accuracy ranged from 65 to 84%, specificity from 46 to 56%, sensitivity from 67 to 89%, positive predictive value from 89 to 91%, and negative predictive value from 24 to 45%. Random forest gave the best results, while k-NN gave the poorest.

Conclusions: The alternative approach of machine learning classification produced results comparable to those of risk prediction scores and can therefore be used as a method of CVD prediction, taking into consideration the advantages that machine learning methodologies may offer.
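For readers less familiar with the five performance measures reported above, the following minimal sketch shows how they are conventionally derived from the four cells of a binary confusion matrix. The function name `classification_metrics` and the counts are hypothetical and chosen for illustration only; they are not taken from the ATTICA data, and the abstract does not state which outcome the authors treated as the positive class.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute standard binary-classification metrics from confusion-matrix counts.

    tp/fp/tn/fn are the numbers of true positives, false positives,
    true negatives and false negatives. All denominators are assumed non-zero.
    """
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,     # share of all predictions that are correct
        "sensitivity": tp / (tp + fn),     # share of actual positives detected
        "specificity": tn / (tn + fp),     # share of actual negatives detected
        "ppv": tp / (tp + fp),             # share of positive predictions that are correct
        "npv": tn / (tn + fn),             # share of negative predictions that are correct
    }


# Hypothetical counts for illustration only (not study data).
metrics = classification_metrics(tp=80, fp=10, tn=40, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Because positive and negative predictive values depend on how common the outcome is in the sample, two methods with similar sensitivity and specificity can still differ on these two measures when evaluated on different populations.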

Highlights

  • The use of Cardiovascular Disease (CVD) risk estimation scores in primary prevention has long been established

  • All possible combinations of the five comparisons described in the Validation subsection and the three machine learning (ML) classifiers were performed in order to evaluate ML performance

  • The specific comparison confirmed that the ML techniques, especially random forest (RF) and decision tree (DT), had efficiency comparable to one another and superior to that of the HellenicSCORE

Summary

Introduction

The use of cardiovascular disease (CVD) risk estimation scores in primary prevention has long been established. Greece was among the pioneer countries, having recalibrated the European Society of Cardiology (ESC) SCORE into the HellenicSCORE by considering the prevalence of CVD risk factors in the Greek population [11]. For the reader who is not familiar with CVD risk prediction scores, it should be noted that a variety of CVD risk prediction tools exist, developed in different countries and populations, using different sets of risk factors and varying widely in performance. The majority of these scores use a common set of the “classical” CVD risk factors, e.g., age, sex, smoking, blood pressure and lipid levels, whereas others have incorporated more advanced markers of CVD. Despite these efforts to identify potential CVD candidates early through risk prediction tools, a high percentage of CVD events occur in people without established risk factors or with low-to-moderate overall risk, while the risk of approximately 20% of high-risk individuals remains underestimated due to misclassification. This suggests the need to identify new methodologies that could optimize the performance of risk prediction [13,14,15,16].
