Liver Disease Classification Using the Elbow Method to Determine Optimal K in the K-Nearest Neighbor (K-NN) Algorithm

Ihya' Nashirudin Abrar, Asrul Abdullah, Sucipto Sucipto

doi:10.32736/sisfokom.v12i2.1643


Open Access



Abstract

Diagnosing liver disease is a difficult task in healthcare. However, by using medical records as datasets and applying data mining methods such as K-Nearest Neighbor (K-NN), we can analyze them and extract knowledge automatically. The K-NN method has been reported to be more effective than other methods; it classifies new data according to its nearest neighbors, the number of which is set by the value of k. In this study, we employed the Elbow method to determine the optimal value of k by measuring the error rate. The test results showed that k = 4 was optimal, giving the lowest error rate. In the third test, we achieved a training accuracy of 80.5% and a testing accuracy of 78.9%. After fine-tuning the training data, accuracy reached 82.2% for training and 77.1% for testing. However, in the fourth test, the model overfit because of data imbalance caused by inappropriate resampling, resulting in a model that was overly complex and overly sensitive to noise.
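The selection of k by error rate described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the two-class toy data, the Euclidean distance, the range of k from 1 to 15, and all function names (`knn_predict`, `error_rate`, `error_curve`, `make_toy_data`) are assumptions made for illustration, and the sketch simply picks the k with the lowest held-out error rate, whereas an elbow plot is usually also inspected visually.

```python
import math
import random
from collections import Counter


def knn_predict(train_X, train_y, x, k):
    """Classify x by majority vote among its k nearest training points.

    Distance is Euclidean (an assumption). On a tied vote, the label that
    appears first among the nearest neighbors wins.
    """
    nearest = sorted(zip(train_X, train_y), key=lambda pair: math.dist(pair[0], x))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def error_rate(train_X, train_y, test_X, test_y, k):
    """Fraction of held-out points that K-NN misclassifies for a given k."""
    wrong = sum(knn_predict(train_X, train_y, x, k) != y for x, y in zip(test_X, test_y))
    return wrong / len(test_X)


def error_curve(train_X, train_y, test_X, test_y, k_values):
    """Error rate for each candidate k; this is the curve an elbow plot shows."""
    return {k: error_rate(train_X, train_y, test_X, test_y, k) for k in k_values}


def make_toy_data(n_per_class, seed):
    """Hypothetical two-class Gaussian data standing in for medical records."""
    rng = random.Random(seed)
    X, y = [], []
    for label, center in ((0, (0.0, 0.0)), (1, (2.0, 2.0))):
        for _ in range(n_per_class):
            X.append((rng.gauss(center[0], 1.0), rng.gauss(center[1], 1.0)))
            y.append(label)
    return X, y


train_X, train_y = make_toy_data(60, seed=0)
test_X, test_y = make_toy_data(30, seed=1)

errors = error_curve(train_X, train_y, test_X, test_y, range(1, 16))
best_k = min(errors, key=errors.get)  # smallest k among those with the lowest error

for k, err in errors.items():
    print(f"k={k:2d}  error rate={err:.3f}")
print("k with lowest error rate:", best_k)
```

On real data the same curve would be computed on a held-out split of the patient records, and the value of k it selects depends on the dataset; the toy data here says nothing about the k = 4 result reported above.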
