Abstract

Feature selection, a combinatorial optimization problem, is widely applied in computational learning to build models with a reduced feature set and thereby improve model performance. A feature selection algorithm aims to identify an admissible subset of features without sacrificing model accuracy. This work uses Improved Binary Particle Swarm Optimization (IBPSO) to identify an optimal subset of features. IBPSO tackles the stagnation, trapping in local optima, and premature convergence that Binary Particle Swarm Optimization (BPSO) suffers from when solving the discrete feature selection problem. Because the feature subset is reduced, IBPSO prevents the model from overfitting and shortens the time needed to construct it. IBPSO integrates the sine function, the cosine function, the position of a random particle, and a linearly decreasing inertia weight, which balance exploration and exploitation in the search for an optimal subset of features. The linearly decreasing inertia weight promotes exploration in the early phase and exploitation of the solution space in the later phase, so that the most informative features are retained while redundant and irrelevant features are discarded. Experiments are carried out on seven benchmark datasets from the University of California, Irvine repository, which hosts a variety of real-world datasets for machine learning. The proposed IBPSO is compared with conventional metaheuristic algorithms such as BPSO, Simulated Annealing, Ant Colony Optimization, and the Genetic Algorithm, as well as with other hybrid metaheuristic feature selection algorithms. The results show that IBPSO maximizes classifier accuracy while achieving the highest dimensionality reduction ratio.
Statistical tests, namely the t-test and the Wilcoxon signed-rank test, are also carried out to demonstrate that IBPSO outperforms the other algorithms considered, at a significance level of 0.05.
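The abstract describes two mechanisms that can be illustrated concretely: a linearly decreasing inertia weight and the binary (sigmoid-based) position update of BPSO. The sketch below is a generic BPSO step under those two ideas, not the authors' exact IBPSO (the sine/cosine and random-particle terms of IBPSO are omitted); the function names, parameter values `w_max = 0.9`, `w_min = 0.4`, and acceleration coefficients `c1 = c2 = 2.0` are illustrative assumptions.

```python
import math
import random

def linear_inertia(w_max, w_min, t, t_max):
    """Linearly decrease the inertia weight from w_max to w_min over t_max
    iterations: large w early (exploration), small w late (exploitation)."""
    return w_max - (w_max - w_min) * t / t_max

def bpso_step(positions, velocities, pbest, gbest, w, c1=2.0, c2=2.0):
    """One generic BPSO update (illustrative, not the paper's IBPSO).

    Velocities are pulled toward each particle's personal best and the
    global best; a sigmoid transfer function maps each velocity to the
    probability that the corresponding feature bit is set to 1.
    """
    for i, (x, v) in enumerate(zip(positions, velocities)):
        for d in range(len(x)):
            r1, r2 = random.random(), random.random()
            v[d] = (w * v[d]
                    + c1 * r1 * (pbest[i][d] - x[d])
                    + c2 * r2 * (gbest[d] - x[d]))
            # Sigmoid transfer: feature d is selected with probability s(v)
            s = 1.0 / (1.0 + math.exp(-v[d]))
            x[d] = 1 if random.random() < s else 0
    return positions, velocities
```

Each bit of a position vector marks whether the corresponding feature is selected; a wrapper loop would evaluate classifier accuracy of each subset as fitness, update `pbest`/`gbest`, and shrink `w` with `linear_inertia` each iteration.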
