Cardiovascular risk assessment using data mining inferencing and feature engineering techniques

Aanchal Sahu, Manjusha Pandey, Siddharth Swarup Rautaray, Harshvardhan Gm, Mahendra Kumar Gourisaria

doi:10.1007/s41870-021-00650-w

Abstract

Hectic lifestyles, heavier workloads and fast-food consumption are contributing to a decline in people’s health, and the number of patients suffering from cardiovascular diseases grows each year. Around the world, millions of people die each year due to cardiovascular diseases. While the statistics are eye-opening, the vast amount of data now available about heart patients means that millions of lives could be saved by detecting the risk at an early stage. With the recent advances in soft computing and fuzzy logic, various algorithmic approaches are employed to tackle cardiovascular risk assessment through machine learning. Using machine learning classifiers such as Logistic Regression (LR), Naive Bayes (NB), Support Vector Machine (SVM), Decision Tree (DT), Random Forest (RF) and K-Nearest Neighbours (KNN), a model can be built to predict the risk accurately. In this paper, we analyse each of these methods both on the original features and with feature engineering techniques, namely transformation onto principal component axes and evaluation over different train-test folds, to find the best-performing model. The best model was found to be KNN when all metrics are considered and Logistic Regression when accuracy alone is considered.
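The comparison described in the abstract can be illustrated with a minimal sketch. It rests on several assumptions that do not come from the paper: scikit-learn is used, the data is a synthetic stand-in generated with make_classification rather than a real heart-disease dataset, "different train-test folds" is read as different train-test split ratios, the number of principal components (8) is arbitrary, and accuracy is the only metric computed. The helper names make_models and evaluate are hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for a patient table (assumption: NOT the paper's data).
X, y = make_classification(
    n_samples=600, n_features=13, n_informative=6, random_state=0
)


def make_models():
    """Return fresh instances of the six classifiers named in the abstract."""
    return {
        "LR": LogisticRegression(max_iter=1000),
        "NB": GaussianNB(),
        "SVM": SVC(),
        "DT": DecisionTreeClassifier(random_state=0),
        "RF": RandomForestClassifier(random_state=0),
        "KNN": KNeighborsClassifier(),
    }


def evaluate(test_size, use_pca):
    """Fit every classifier on one split and return its test accuracy."""
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=test_size, stratify=y, random_state=0
    )
    scores = {}
    for name, model in make_models().items():
        steps = [StandardScaler()]
        if use_pca:
            # Project the scaled features onto the leading principal axes.
            steps.append(PCA(n_components=8))
        steps.append(model)
        pipe = make_pipeline(*steps)
        pipe.fit(X_tr, y_tr)
        scores[name] = accuracy_score(y_te, pipe.predict(X_te))
    return scores


# One result table per (split ratio, PCA on/off) combination.
results = {
    (test_size, use_pca): evaluate(test_size, use_pca)
    for test_size in (0.2, 0.3, 0.4)
    for use_pca in (False, True)
}

for (test_size, use_pca), scores in results.items():
    best = max(scores, key=scores.get)
    print(f"test_size={test_size} pca={use_pca} best={best} acc={scores[best]:.3f}")
```

Wrapping the scaler, the optional PCA step and the classifier in one pipeline means the transformation is fitted on the training portion only, so no information from the test portion leaks into the model. The rankings printed for synthetic data say nothing about which classifier performs best on real patient records.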

Full Text