Abstract

Data mining techniques are widely used in the healthcare sector to predict and diagnose diseases, and the diagnosis of heart disease is one of their most important applications. Data is now collected in such large amounts that people must rely on devices to handle it. In recent years, the incidence of heart disease has risen sharply, and it has become one of the deadliest diseases in many countries. Most datasets suffer from noise that reduces classification accuracy. Noise here means irrelevant or incorrect data, missing values, and inconsistencies in the dataset. Data transformation is another important preprocessing step: it converts data into a form suitable for mining models through aggregation and filtering methods, such as eliminating redundant features using correlation and one of the wrapper methods, and applying feature discretization. Missing values are handled by removing the records that contain them and by imputation (estimation) methods. Classification methods such as Naive Bayes (NB) and Random Forest (RF) are applied both to the original datasets and to the datasets after feature selection. All of these operations are carried out on three heart disease datasets to analyze the effect of preprocessing on accuracy.
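The two ways of handling missing values mentioned above, removing incomplete records and estimating the missing entries, can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes missing cells are marked with `?`, uses column-mean imputation as one possible estimation method, and the helper names and sample values are made up for the example.

```python
from statistics import mean

MISSING = "?"  # assumption: missing cells are marked with "?"

def parse(rows):
    """Convert string cells to floats, using None for missing cells."""
    return [[None if c == MISSING else float(c) for c in row] for row in rows]

def drop_missing(rows):
    """Removal strategy: keep only the rows without any missing cell."""
    return [r for r in rows if None not in r]

def impute_mean(rows):
    """Estimation strategy: replace each missing cell with its column mean.

    Raises statistics.StatisticsError if a column has no observed value.
    """
    cols = list(zip(*rows))
    col_means = [mean(v for v in col if v is not None) for col in cols]
    return [[m if v is None else v for v, m in zip(r, col_means)] for r in rows]

# Made-up sample values (age, resting blood pressure, cholesterol).
raw = [
    ["63", "145", "233"],
    ["67", "?", "286"],
    ["41", "130", "?"],
    ["56", "120", "236"],
]
rows = parse(raw)
dropped = drop_missing(rows)   # two complete rows remain
imputed = impute_mean(rows)    # all four rows kept, gaps filled
```

Removal keeps the data untouched but shrinks the sample, while imputation keeps every record at the cost of introducing estimated values; comparing classifier accuracy under both is one way to measure the preprocessing effect the abstract describes.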

Highlights

  • Heart disease is one of the major causes of death today

  • For Random Forest prediction, accuracy matters most in fields such as healthcare, especially for heart disease, where processing time can serve as a differentiator; time-sensitive fields such as disaster prediction instead seek rapid predictions and can trade some accuracy for speed

  • We find that the Naive Bayes algorithm combined with the wrapper method is appropriate for classifying the Hungarian, Switzerland, and Cleveland datasets of the heart disease group
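To illustrate how a wrapper method can be combined with Naive Bayes, the sketch below runs a greedy forward selection that scores each candidate feature subset by the accuracy of a Gaussian Naive Bayes classifier. It is a simplified example under stated assumptions: the source does not say which wrapper search or evaluation protocol was used, the toy data is made up, and all function names are hypothetical.

```python
import math
from statistics import mean, pvariance

def fit_gnb(X, y):
    """Fit Gaussian Naive Bayes: class prior plus per-feature mean and variance."""
    model = {}
    for c in set(y):
        rows = [x for x, label in zip(X, y) if label == c]
        # Small constant keeps the variance strictly positive.
        stats = [(mean(col), pvariance(col) + 1e-9) for col in zip(*rows)]
        model[c] = (len(rows) / len(X), stats)
    return model

def predict_gnb(model, x):
    """Return the class with the highest log-posterior for one sample."""
    def log_post(prior, stats):
        return math.log(prior) + sum(
            -0.5 * math.log(2 * math.pi * var) - (v - mu) ** 2 / (2 * var)
            for v, (mu, var) in zip(x, stats))
    return max(model, key=lambda c: log_post(*model[c]))

def accuracy(X_tr, y_tr, X_te, y_te, feats):
    """Hold-out accuracy of Naive Bayes restricted to the feature subset feats."""
    pick = lambda X: [[row[j] for j in feats] for row in X]
    model = fit_gnb(pick(X_tr), y_tr)
    hits = sum(predict_gnb(model, x) == t for x, t in zip(pick(X_te), y_te))
    return hits / len(y_te)

def forward_select(X_tr, y_tr, X_te, y_te):
    """Greedy wrapper: add the feature that most improves accuracy; stop when none does."""
    selected, best = [], 0.0
    remaining = list(range(len(X_tr[0])))
    while remaining:
        scored = [(accuracy(X_tr, y_tr, X_te, y_te, selected + [j]), j)
                  for j in remaining]
        # Highest accuracy wins; ties go to the lower feature index.
        score, j = max(scored, key=lambda t: (t[0], -t[1]))
        if score <= best:
            break
        selected.append(j)
        remaining.remove(j)
        best = score
    return selected, best

# Made-up toy data: feature 0 separates the classes, feature 1 is noise.
X_tr = [[1.0, 7.0], [1.2, 3.0], [0.8, 5.0], [5.0, 4.0], [5.2, 6.0], [4.8, 2.0]]
y_tr = [0, 0, 0, 1, 1, 1]
X_te = [[1.1, 2.0], [0.9, 6.0], [5.1, 7.0], [4.9, 3.0]]
y_te = [0, 0, 1, 1]
selected, best = forward_select(X_tr, y_tr, X_te, y_te)
```

Because the same hold-out set both guides the search and reports the score, the resulting accuracy is optimistic; a real experiment would typically score subsets with cross-validation on the training data only.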


Summary

Introduction

Heart disease is one of the major causes of death today. A heart disease prediction system can support healthcare specialists in predicting a patient's heart condition from clinical data that has been entered into the system. The datasets were collected from the UCI Machine Learning Repository. It holds 394 dataset copies with 14 attributes, including sex, age, chest pain type, resting blood pressure, resting electrocardiographic results, fasting blood sugar > 120 mg/dl, serum cholesterol in mg/dl, exercise-induced angina, maximum heart rate achieved, the slope of the peak exercise ST segment, oldpeak (ST depression induced by exercise relative to rest), number of major vessels (0-3) colored by fluoroscopy, and thal (3 = normal; 6 = fixed defect; 7 = reversible defect). We summarize our work and outline future work in the conclusion section.
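As a rough illustration of the attribute layout described above, the following sketch parses one comma-separated record into named fields and decodes the thal code. The short column names follow the naming commonly used for the UCI heart disease files and are an assumption here, as is the `?` marker for missing values and the final `num` diagnosis column; the sample record is made up.

```python
# Assumed short column names; the text above gives only descriptive names.
COLUMNS = ["age", "sex", "cp", "trestbps", "chol", "fbs", "restecg",
           "thalach", "exang", "oldpeak", "slope", "ca", "thal", "num"]

# thal codes as listed in the text.
THAL = {3: "normal", 6: "fixed defect", 7: "reversible defect"}

def parse_record(line):
    """Parse one comma-separated record into a dict; "?" becomes None."""
    cells = line.strip().split(",")
    if len(cells) != len(COLUMNS):
        raise ValueError(f"expected {len(COLUMNS)} fields, got {len(cells)}")
    return {name: None if cell == "?" else float(cell)
            for name, cell in zip(COLUMNS, cells)}

def describe_thal(record):
    """Translate the numeric thal code into its label, or "unknown"."""
    code = record["thal"]
    return "unknown" if code is None else THAL.get(int(code), "unknown")

# Illustrative record with made-up values, not taken from the datasets.
rec = parse_record("54,1,4,130,250,0,0,150,0,1.5,2,?,7,1")
label = describe_thal(rec)
```

Keeping missing entries as `None` rather than dropping them at parse time leaves the choice between removal and imputation to a later preprocessing step.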
