Backward Elimination for Feature Selection on Breast Cancer Classification Using Logistic Regression and Support Vector Machine Algorithms

Salsha Farahdiba, Triando Hamonangan Saragih, Dwi Kartini, Rudy Herteno, Radityo Adi Nugroho

doi:10.22146/ijccs.88926

Abstract

Breast cancer is a prevalent cancer that affects women worldwide. One way to reduce the high fatality rate of breast cancer is a detection system that determines whether a tumor is benign or malignant. The Logistic Regression and Support Vector Machine (SVM) classification algorithms are often used to detect this disease, but they often do not give optimal results on datasets with many features, so an additional step, Backward Elimination feature selection, is needed to improve classification performance. Logistic Regression and SVM were compared, with feature selection applied to breast cancer data, to identify the best model. The breast cancer dataset has 30 features and two classes, Benign and Malignant. Backward Elimination reduced the number of features from 30 to 13, which increased the performance of both classification models. The best classification was obtained with Backward Elimination feature selection and a linear-kernel SVM: accuracy rose from 96.14% to 97.02%, precision from 98.06% to 99.49%, recall from 90.48% to 92.38%, and AUC from 0.95 to 0.96.
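
Backward Elimination, as used above, can be sketched as a greedy loop that starts from all features and repeatedly drops the one whose removal hurts the score least. The sketch below is an illustration under stated assumptions, not the authors' implementation: the abstract does not state the elimination criterion, so a score-based (wrapper) variant is assumed. The toy data, the function names `backward_elimination` and `centroid_accuracy`, and the nearest-centroid scorer are all hypothetical. In a setting like the one described, the scorer would more plausibly be the cross-validated accuracy of Logistic Regression or a linear-kernel SVM, with 30 starting features and 13 kept.

```python
from typing import Callable, List

# Hypothetical toy data (NOT the breast cancer dataset): features 0 and 2
# separate the two classes, features 1 and 3 are noise.
X = [
    [1.0, 5.0, 0.9, 2.0],
    [1.2, 1.0, 1.1, 9.0],
    [0.8, 9.0, 1.0, 4.0],
    [1.1, 3.0, 0.8, 7.0],
    [3.0, 4.0, 3.1, 8.0],
    [3.2, 8.0, 2.9, 1.0],
    [2.8, 2.0, 3.0, 6.0],
    [3.1, 6.0, 3.2, 3.0],
]
y = [0, 0, 0, 0, 1, 1, 1, 1]


def centroid_accuracy(features: List[int]) -> float:
    """Training accuracy of a nearest-centroid classifier on a feature subset.

    Assumption: this is only a stand-in for the score of a real model
    (for example, cross-validated accuracy of Logistic Regression or SVM).
    """
    centroids = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        centroids[label] = [sum(r[f] for r in rows) / len(rows) for f in features]
    correct = 0
    for x, t in zip(X, y):
        # Squared Euclidean distance to each class centroid on the subset.
        dist = {
            label: sum((x[f] - c) ** 2 for f, c in zip(features, cen))
            for label, cen in centroids.items()
        }
        correct += min(dist, key=dist.get) == t
    return correct / len(y)


def backward_elimination(
    n_features: int, score: Callable[[List[int]], float], n_keep: int
) -> List[int]:
    """Start from all features; drop one per round until n_keep remain."""
    selected = list(range(n_features))
    while len(selected) > n_keep:
        # Evaluate every subset obtained by removing a single feature and
        # keep the best-scoring one (ties go to the first candidate).
        candidates = [[f for f in selected if f != drop] for drop in selected]
        selected = max(candidates, key=score)
    return selected


selected = backward_elimination(n_features=4, score=centroid_accuracy, n_keep=2)
print(selected)
```

On this toy data the loop discards the two noise features and keeps the two informative ones. A fixed target count (`n_keep`) is one possible stopping rule; stopping when the score no longer improves is another common choice.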
