Implementation of XGBoost for Balancing the Liver Patient Dataset with SMOTE and Bayesian Search Hyperparameter Tuning

Rahmad Ubaidillah, Muliadi Muliadi, Dodon Turianto Nugrahadi, Rudy Herteno, M Reza Faisal

doi:10.30865/mib.v6i3.4146

Abstract

Liver disease is a disorder of liver function caused by infection with viruses or bacteria, or by exposure to toxic substances, that prevents the liver from functioning properly. Early diagnosis is important, and classification algorithms can support it. Using the Indian Liver Patient Dataset, a classifier can predict whether or not a patient has liver disease. However, the dataset is imbalanced between patients with liver disease and those without, which can reduce the performance of the prediction model because it tends to produce non-specific predictions. In this study, classification is performed with XGBoost, with SMOTE added to address the class imbalance and/or Bayesian search hyperparameter tuning applied to improve model performance. The plain XGBoost model achieved an AUC of 0.618, XGBoost with Bayesian search 0.658, XGBoost with SMOTE 0.716, and XGBoost with SMOTE and Bayesian search 0.767. Of the four models, XGBoost with SMOTE and Bayesian search achieved the highest AUC, 0.149 higher than the XGBoost model without SMOTE or Bayesian search.
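
To illustrate the oversampling idea behind SMOTE, the following is a minimal sketch using only the Python standard library. It is not the implementation used in the study, which is not given here; the function name `smote_sketch`, its parameters, and the toy data are all hypothetical. It assumes numeric features and generates each synthetic minority sample by interpolating between a randomly chosen minority point and one of its k nearest minority neighbours.

```python
import math
import random


def smote_sketch(minority, n_new, k=5, seed=0):
    """Create n_new synthetic minority samples by neighbour interpolation.

    Simplified illustration of the SMOTE idea (hypothetical helper, not the
    study's code): pick a minority point, pick one of its k nearest minority
    neighbours, and place a new point on the segment between them.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # k nearest minority neighbours of the base point (Euclidean distance)
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        # The new point lies at a random position on the segment base -> neighbour
        gap = rng.random()
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Hypothetical toy data, not taken from the Indian Liver Patient Dataset.
minority = [(1.0, 2.0), (1.5, 1.8), (2.0, 2.2), (1.2, 2.5)]
new_points = smote_sketch(minority, n_new=6, k=2)
print(len(new_points))
```

Because every synthetic point is a convex combination of two existing minority points, it stays inside the region already occupied by the minority class. In a setup like the one described, oversampling of this kind would normally be applied to the training split only, so that the AUC is measured on unmodified data.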
