SVM Optimization with Correlation Feature Selection Based Binary Particle Swarm Optimization for Diagnosis of Chronic Kidney Disease

Doni Aprilianto

doi:10.52465/joscex.v1i1.1

Abstract

Data mining is widely used to diagnose diseases from medical data. This study uses the chronic kidney disease (CKD) dataset from the UCI Machine Learning Repository, which has 25 attributes and 400 samples. With 25 attributes, the dataset may contain redundant data, which can reduce computational efficiency and classification accuracy. The accuracy of a classification algorithm can be increased by reducing the dimensionality of the dataset. Correlation-based Feature Selection (CFS) can quickly identify and filter out redundant attributes. However, CFS has the disadvantage that the attributes it selects are not necessarily the best ones. This weakness can be overcome with Binary Particle Swarm Optimization (BPSO), which chooses attributes based on the best fitness value. The purpose of this study is to improve the accuracy of the Support Vector Machine (SVM) by combining CFS and BPSO for feature selection. The accuracy of SVM alone in predicting CKD is 63.75%. With CFS as feature selection, the accuracy of SVM is 88.75%, and with the combination of CFS and BPSO, the average accuracy over ten runs of SVM is 95%. Thus, combining CFS and BPSO for feature selection improves the accuracy of SVM in diagnosing CKD by 31.25 percentage points.
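To illustrate how BPSO can choose attributes by fitness, the following is a minimal sketch of a standard sigmoid-based binary PSO, not the implementation used in this study. The function names `bpso` and `toy_fitness`, the parameter values (`w`, `c1`, `c2`, swarm size, iteration count), and the toy fitness itself are assumptions made for illustration. In a setting like the one described here, the fitness would presumably be the classification accuracy of SVM on the attributes that the mask selects.

```python
import math
import random


def bpso(fitness, n_features, n_particles=10, n_iters=30,
         w=0.7, c1=1.5, c2=1.5, seed=0):
    """Search for a 0/1 feature mask that maximizes `fitness` (illustrative)."""
    rng = random.Random(seed)
    # Each particle is a bit mask: 1 keeps the attribute, 0 drops it.
    pos = [[rng.randint(0, 1) for _ in range(n_features)]
           for _ in range(n_particles)]
    vel = [[0.0] * n_features for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(p) for p in pos]
    best = max(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[best][:], pbest_fit[best]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(n_features):
                r1, r2 = rng.random(), rng.random()
                # Velocity update, same form as continuous PSO.
                v = (w * vel[i][d]
                     + c1 * r1 * (pbest[i][d] - pos[i][d])
                     + c2 * r2 * (gbest[d] - pos[i][d]))
                # Clamp so the sigmoid does not saturate completely.
                vel[i][d] = max(-6.0, min(6.0, v))
                # The sigmoid maps velocity to the probability that the bit is 1.
                prob = 1.0 / (1.0 + math.exp(-vel[i][d]))
                pos[i][d] = 1 if rng.random() < prob else 0
            fit = fitness(pos[i])
            if fit > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], fit
                if fit > gbest_fit:
                    gbest, gbest_fit = pos[i][:], fit
    return gbest, gbest_fit


def toy_fitness(mask):
    """Hypothetical stand-in for classifier accuracy: attributes 0 and 2
    are treated as useful, the rest as redundant."""
    return mask[0] + mask[2] - (mask[1] + mask[3] + mask[4])


mask, score = bpso(toy_fitness, n_features=5)
print(mask, score)
```

Replacing `toy_fitness` with a function that trains and scores a classifier on the selected columns would turn this into a wrapper-style feature selector. Running it only on the attributes that a filter such as CFS has already retained keeps the search space small.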
