Abstract

Coronary Artery Disease (CAD) is one of the most common cardiovascular diseases, and several studies have used different sets of patient-derived features for the timely diagnosis of CAD. In this study, a dataset with 21 features was used, and a risk score prediction system is proposed. The patients were divided into four groups. To determine the effective features of the CAD dataset, the t-test and Relief-f feature selection methods were applied with Logistic Regression Analysis (LRA), and Relief-f was applied with a Neural Network (NN). Sampling methods were used to correct the imbalance of the four-class dataset, and their effects were evaluated. With the NN, accuracy was 72.3% before the preprocessing operations; after oversampling and Relief-f feature selection, accuracy reached 84.1%, with a sensitivity of 0.84 and a specificity of 0.94. These were the best results obtained for the CAD dataset in this study. Using the feature selection and sampling methods with the NN substantially improves the prediction accuracy as well as the other metrics. This suggests that these preprocessing methods and the NN may be used together to construct prediction systems for four-class imbalanced medical datasets.
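
As an illustration of two steps named in the abstract, the following is a minimal Python sketch of oversampling a four-class dataset and computing sensitivity and specificity for it. The abstract does not say which oversampling variant was used or how the per-class metrics were aggregated, so plain random oversampling and one-vs-rest macro-averaging are assumptions here. The function names and the toy labels are hypothetical and are not taken from the study.

```python
import random
from collections import Counter, defaultdict


def random_oversample(rows, labels, seed=0):
    """Duplicate minority-class rows at random until every class
    has as many rows as the largest class (assumed variant)."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for row, label in zip(rows, labels):
        by_class[label].append(row)
    target = max(len(members) for members in by_class.values())
    out_rows, out_labels = [], []
    for label, members in by_class.items():
        # Sample with replacement to fill the gap up to the target size.
        extra = [rng.choice(members) for _ in range(target - len(members))]
        for row in members + extra:
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


def macro_sensitivity_specificity(y_true, y_pred):
    """One-vs-rest sensitivity and specificity, averaged over the
    classes present in y_true (assumed aggregation)."""
    classes = sorted(set(y_true))
    pairs = list(zip(y_true, y_pred))
    sens, spec = [], []
    for c in classes:
        tp = sum(t == c and p == c for t, p in pairs)
        fn = sum(t == c and p != c for t, p in pairs)
        tn = sum(t != c and p != c for t, p in pairs)
        fp = sum(t != c and p == c for t, p in pairs)
        sens.append(tp / (tp + fn))
        # Guard against a single-class input, where no negatives exist.
        spec.append(tn / (tn + fp) if (tn + fp) else 0.0)
    return sum(sens) / len(classes), sum(spec) / len(classes)


if __name__ == "__main__":
    # Toy imbalanced labels for four risk groups (made-up sizes).
    labels = [0] * 8 + [1] * 4 + [2] * 2 + [3] * 1
    rows = [(i,) for i in range(len(labels))]  # placeholder feature rows
    _, balanced_labels = random_oversample(rows, labels)
    print(Counter(balanced_labels))

    y_true = [0, 0, 1, 1, 2, 2, 3, 3]
    y_pred = [0, 0, 1, 0, 2, 2, 3, 2]
    print(macro_sensitivity_specificity(y_true, y_pred))
```

In practice, oversampling of this kind would normally be applied to the training split only, so that duplicated rows do not leak into the evaluation data.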
