Abstract

Heart disease is one of the major causes of death worldwide. Because diagnosing it is expensive, there is a need to predict the risk of heart disease from a reduced set of selected features. Feature selection methods can lower the cost of diagnosis by identifying the most important attributes. The objectives of this study are to build a classification model for predicting heart disease and to determine which selected features play a key role in that prediction, using the Cleveland and Statlog heart datasets. The random forest algorithm achieved an accuracy of 90–95% in both the classification model and the feature selection model across three different percentage splits. Subsets of 8 and 6 selected features appear to be the minimum needed to build a better-performing model; dropping further features from these subsets may not improve prediction performance.
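
The workflow summarized above (random forest classification, importance-based feature selection, and evaluation over several percentage splits) can be sketched roughly as follows. This is an illustrative sketch only, not the authors' pipeline: it assumes scikit-learn, uses synthetic data from `make_classification` in place of the Cleveland and Statlog datasets, and the 13-feature count, the 60/40, 70/30 and 80/20 splits, the median importance threshold, and the helper name `evaluate_split` are all assumptions made for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectFromModel
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split


def evaluate_split(X, y, test_size, seed=0):
    """Hypothetical helper: return (accuracy with all features,
    accuracy with selected features, number of selected features)."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_size, stratify=y, random_state=seed
    )

    # Classification model trained on every available feature.
    full = RandomForestClassifier(n_estimators=200, random_state=seed)
    full.fit(X_train, y_train)
    acc_full = accuracy_score(y_test, full.predict(X_test))

    # Feature selection model: keep features whose random forest
    # importance is at or above the median (an assumed threshold).
    selector = SelectFromModel(full, prefit=True, threshold="median")
    X_train_sel = selector.transform(X_train)
    X_test_sel = selector.transform(X_test)

    # Retrain on the reduced feature set and evaluate on the same test rows.
    reduced = RandomForestClassifier(n_estimators=200, random_state=seed)
    reduced.fit(X_train_sel, y_train)
    acc_sel = accuracy_score(y_test, reduced.predict(X_test_sel))

    return acc_full, acc_sel, X_train_sel.shape[1]


# Synthetic stand-in data; NOT the Cleveland or Statlog heart datasets.
X, y = make_classification(
    n_samples=300, n_features=13, n_informative=6, n_redundant=2, random_state=0
)

# Assumed test fractions corresponding to 60/40, 70/30 and 80/20 splits.
results = {t: evaluate_split(X, y, t) for t in (0.4, 0.3, 0.2)}
for test_size, (acc_full, acc_sel, n_sel) in results.items():
    print(f"test={test_size:.0%}  all features: {acc_full:.3f}  "
          f"{n_sel} selected features: {acc_sel:.3f}")
```

Comparing the two accuracies at each split gives a rough sense of whether a reduced feature set holds up; the numbers printed here come from synthetic data and say nothing about the results reported in the study.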
