Abstract

In real-life data science problems, it is rare that all the features in a dataset are useful for building a model. In machine learning, feature selection is the process of selecting a subset of relevant features or attributes for constructing a model. Removing irrelevant and redundant features and keeping only relevant ones improves the accuracy of a machine learning model; conversely, adding unnecessary variables increases the model's overall complexity. Our experiment indicates that the accuracy of a classification model is strongly affected by the feature selection process. We trained three algorithms (K-Nearest Neighbors, Decision Tree, Multi-layer Perceptron) on all the features and obtained accuracies of 49%, 84%, and 71%, respectively. After performing feature selection, without any logical changes to the model code, the accuracy scores rose to 82%, 86%, and 78%, respectively, a substantial improvement.
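The following is a minimal sketch of the workflow the abstract describes: scoring each classifier on all features and then on a selected subset, with the model code itself unchanged. The dataset, the selector (a univariate ANOVA F-test via scikit-learn's SelectKBest with k=10), and the train/test split are illustrative assumptions, not the authors' actual experimental setup.

```python
# Illustrative sketch only: the dataset, feature selector, and split below
# are assumptions, not the paper's actual experimental configuration.
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import accuracy_score

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

models = {
    "K-Nearest Neighbors": KNeighborsClassifier(),
    "Decision Tree": DecisionTreeClassifier(random_state=42),
    "Multi-layer Perceptron": MLPClassifier(max_iter=1000, random_state=42),
}

# Pick the k best features by a univariate ANOVA F-test on the training set.
selector = SelectKBest(f_classif, k=10).fit(X_train, y_train)

# Score each model on all features, then on the selected subset;
# the model definitions themselves are untouched.
for name, model in models.items():
    model.fit(X_train, y_train)
    acc_all = accuracy_score(y_test, model.predict(X_test))
    model.fit(selector.transform(X_train), y_train)
    acc_sel = accuracy_score(y_test, model.predict(selector.transform(X_test)))
    print(f"{name}: all features {acc_all:.2f}, selected features {acc_sel:.2f}")
```

Note that the selector is fit only on the training data; fitting it on the full dataset would leak information from the test set into the feature scores.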
