K-Means Cluster-based Random Undersampling and Meta-Learning Approach for Village Development Status Classification

Ahmad Ilham, Laelatul Khikmah, Safuan Safuan, Luqman Assaffat, Suprapedi Suprapedi

doi:10.30630/joiv.7.2.989

Open Access

https://doi.org/10.30630/joiv.7.2.989

Abstract

The village development index (Indeks Desa Membangun, IDM) dataset is significantly class-imbalanced: the self-supporting class contains far more cases than the disadvantaged class. Traditional classifiers can achieve high accuracy (ACC) by fitting the cases of the majority class while neglecting the minority class, so the classification results may be biased. In this study, random undersampling based on k-means clustering (KMC) was combined with a meta-learning approach to improve the ACC of the village status classification model. AdaBoost was used as the meta technique and Random Forest as the base learner. The proposed model was evaluated using the area under the curve (AUC), and the experimental results showed that it yielded excellent performance compared with prior studies, with AUC, ACC, precision (PR), recall (RC), and g-mean (Gm) values of 95.50%, 95.52%, 95.5%, 95.5%, and 92.95%, respectively. A t-test likewise showed that the proposed model yielded excellent performance compared with previous studies. It can be concluded that the AdaBoost algorithm reduced misclassification and changed the distribution of the data loss function in the random forests. This indicates that the proposed model deals effectively with imbalanced classes in village development status classification.
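The abstract does not spell out the undersampling procedure, so the following is only a minimal sketch of one common way to do k-means cluster-based random undersampling: cluster the majority class, then randomly keep a proportional share of rows from each cluster. The function names (`kmeans_labels`, `kmc_undersample`), the parameters (`n_keep`, `k`, `iters`, `seed`), the proportional-quota rule, and the synthetic data are all assumptions made for illustration and are not taken from the paper.

```python
import random
from collections import defaultdict


def kmeans_labels(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means over tuples of floats; returns one cluster id per point."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initial centroids: k random rows
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels


def kmc_undersample(majority, n_keep, k=3, seed=0):
    """Keep roughly n_keep majority rows, sampled at random from each k-means cluster.

    Assumption: each cluster contributes in proportion to its size (at least one row),
    so the result can differ from n_keep by a few rows because of rounding.
    """
    rng = random.Random(seed)
    clusters = defaultdict(list)
    for p, lab in zip(majority, kmeans_labels(majority, k, seed=seed)):
        clusters[lab].append(p)
    kept = []
    for members in clusters.values():
        quota = max(1, round(n_keep * len(members) / len(majority)))
        kept.extend(rng.sample(members, min(quota, len(members))))
    return kept


# Synthetic stand-in data (not the IDM dataset): 90 majority rows, 15 minority rows.
rng = random.Random(42)
majority = [
    (rng.gauss(cx, 0.5), rng.gauss(cy, 0.5))
    for cx, cy in [(0, 0), (5, 5), (0, 5)]
    for _ in range(30)
]
minority = [(rng.gauss(2.5, 0.5), rng.gauss(2.5, 0.5)) for _ in range(15)]

reduced = kmc_undersample(majority, n_keep=len(minority), k=3)
print(len(majority), "->", len(reduced), "majority rows;", len(minority), "minority rows")
```

In a pipeline like the one the abstract describes, the reduced majority rows would be combined with the minority rows and passed to the classifier (AdaBoost over Random Forest base learners), which is not shown here.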

Full Text