Heart disease is a major public health issue and the leading cause of death worldwide. Risk factors include hypertension, diabetes, obesity, a sedentary lifestyle, smoking, and genetic predisposition. This study develops a heart disease prediction model using the Random Forest method. The dataset comes from the UCI Machine Learning Repository and contains records for 1026 patients with various health features. The methodology follows the stages of knowledge discovery in databases (KDD): data selection, preprocessing, transformation, data mining, and evaluation. A model with 100 decision trees achieved an accuracy of 0.9823. Evaluation with the confusion matrix and classification report shows 98% accuracy, 100% precision, 96% recall, and a 98% F1-score. In conclusion, the Random Forest method is effective in predicting heart disease, with features such as thal having a significant impact on model accuracy.
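The reported F1-score can be checked against the reported precision and recall, since F1 is their harmonic mean. The following is a minimal sketch of that arithmetic, not the study's evaluation code; the function names `f1_from_precision_recall` and `metrics_from_confusion` are hypothetical helpers introduced here for illustration, and no confusion-matrix counts from the study are assumed.

```python
def f1_from_precision_recall(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def metrics_from_confusion(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Standard binary-classification metrics from confusion-matrix counts.

    Hypothetical helper: the counts are supplied by the caller, since the
    abstract does not list the study's actual confusion matrix.
    """
    total = tp + fp + fn + tn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": precision,
        "recall": recall,
        "f1": f1_from_precision_recall(precision, recall),
    }


# Values taken from the abstract: precision 100%, recall 96%.
reported_f1 = f1_from_precision_recall(1.00, 0.96)
print(round(reported_f1, 2))  # 0.98
```

With precision 1.00 and recall 0.96, the harmonic mean is 1.92 / 1.96, which rounds to 0.98 and agrees with the 98% F1-score stated in the abstract.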