Abstract

In real-life data science problems, it is rare that all the features in a dataset are useful for building a model. In machine learning, feature selection is the process of selecting a subset of relevant features or attributes for constructing a model. Removing irrelevant and redundant features and keeping only relevant ones improves the accuracy of a machine learning model; conversely, adding unnecessary variables increases the model's overall complexity. Our experiment indicates that the accuracy of a classification model is strongly affected by the feature selection process. We trained three algorithms (K-Nearest Neighbors, Decision Tree, Multi-layer Perceptron) on all the features and obtained accuracies of 49%, 84%, and 71%, respectively. After performing feature selection, without any logical changes to the model code, the accuracy scores rose to 82%, 86%, and 78%, respectively, a substantial improvement.
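The following is a minimal sketch of the workflow the abstract describes: scoring each classifier on all features and then on a selected subset, with the model code itself unchanged. The dataset, the selector (a univariate ANOVA F-test via scikit-learn's SelectKBest with k=10), and the train/test split are illustrative assumptions, not the authors' actual experimental setup.

```python
# Illustrative sketch only: the dataset, feature selector, and split below
# are assumptions, not the paper's actual experimental configuration.
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import accuracy_score

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

models = {
    "K-Nearest Neighbors": KNeighborsClassifier(),
    "Decision Tree": DecisionTreeClassifier(random_state=42),
    "Multi-layer Perceptron": MLPClassifier(max_iter=1000, random_state=42),
}

# Pick the k best features by a univariate ANOVA F-test on the training set.
selector = SelectKBest(f_classif, k=10).fit(X_train, y_train)

# Score each model on all features, then on the selected subset;
# the model definitions themselves are untouched.
for name, model in models.items():
    model.fit(X_train, y_train)
    acc_all = accuracy_score(y_test, model.predict(X_test))
    model.fit(selector.transform(X_train), y_train)
    acc_sel = accuracy_score(y_test, model.predict(selector.transform(X_test)))
    print(f"{name}: all features {acc_all:.2f}, selected features {acc_sel:.2f}")
```

Note that the selector is fit only on the training data; fitting it on the full dataset would leak information from the test set into the feature scores.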
