Comparison of Bagging and Adaboost Methods on C4.5 Algorithm for Stroke Prediction

Nur Diana Saputri, Dwi Rolliawati, Khalid Khalid

doi:10.32520/stmsi.v11i3.1684

Abstract

Stroke is a dangerous non-communicable disease in which brain function is impaired by a blockage of blood circulation. It is classified as a cerebrovascular disease and requires treatment within 24 hours; if not treated quickly, it can cause death. This research aims to address the problem by building a machine learning-based prediction model that supports medical experts in managing the disease and helps reduce the risk of death. The study applies the C4.5 classification algorithm together with two Ensemble Learning methods, bagging and Adaboost. The stroke data is processed in two stages: data cleaning and data transformation. Three models are compared: the C4.5 algorithm alone, bagging + C4.5, and Adaboost + C4.5. They are evaluated with the confusion matrix, k-fold cross-validation and a validation test, based on the values of TP, TN, FP, FN, recall, precision, F1-score and accuracy. Under the confusion matrix and k-fold cross-validation, the C4.5 algorithm alone achieved an accuracy of 92.87%. Accuracy rose to 95.02% when C4.5 was combined with bagging and to 94.63% when it was combined with Adaboost. These results indicate that combining a single classifier, the C4.5 algorithm, with the bagging or Adaboost method improves classification performance.
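The evaluation measures named in the abstract can be illustrated with a minimal Python sketch. The abstract does not state the formulas or the number of folds, so the standard definitions of accuracy, precision, recall and F1-score are assumed here, along with a simple unshuffled fold split. The names `ConfusionCounts`, `classification_metrics` and `k_fold_indices` are hypothetical, and the counts in the usage example are made up for illustration; they are not the study's results.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Binary confusion-matrix cells (hypothetical container for illustration)."""
    tp: int  # stroke cases predicted as stroke
    tn: int  # non-stroke cases predicted as non-stroke
    fp: int  # non-stroke cases predicted as stroke
    fn: int  # stroke cases predicted as non-stroke


def _safe_div(numerator: float, denominator: float) -> float:
    # Return 0.0 instead of raising when a denominator is zero.
    return numerator / denominator if denominator else 0.0


def classification_metrics(c: ConfusionCounts) -> dict:
    """Compute accuracy, precision, recall and F1 using the standard definitions."""
    total = c.tp + c.tn + c.fp + c.fn
    precision = _safe_div(c.tp, c.tp + c.fp)
    recall = _safe_div(c.tp, c.tp + c.fn)
    return {
        "accuracy": _safe_div(c.tp + c.tn, total),
        "precision": precision,
        "recall": recall,
        "f1": _safe_div(2 * precision * recall, precision + recall),
    }


def k_fold_indices(n_samples: int, k: int) -> list:
    """Split range(n_samples) into k contiguous folds (no shuffling, an assumption).

    Returns a list of (train_indices, test_indices) pairs, one per fold.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    base, extra = divmod(n_samples, k)
    folds, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)  # first `extra` folds get one more sample
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        folds.append((train, test))
        start += size
    return folds


if __name__ == "__main__":
    # Made-up counts for one fold, purely illustrative.
    example = ConfusionCounts(tp=8, tn=80, fp=2, fn=10)
    for name, value in classification_metrics(example).items():
        print(f"{name}: {value:.4f}")
```

In a k-fold setting, metrics such as these would typically be computed on each fold's test split and then averaged. On imbalanced data, accuracy can remain high even when recall for the positive class is low, which is one reason to report recall, precision and F1-score next to accuracy.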
