Feature Selection Using Enhanced Particle Swarm Optimisation for Classification Models.

Hailun Xie, Li Zhang, Han Liu, Yonghong Yu, Chee Peng Lim

doi:10.3390/s21051816

Open Access

Abstract

In this research, we propose two Particle Swarm Optimisation (PSO) variants to undertake feature selection tasks. The aim is to overcome two major shortcomings of the original PSO model, i.e., premature convergence and weak exploitation around near-optimal solutions. The first proposed PSO variant incorporates four key operations: a modified PSO operation with rectified personal and global best signals, spiral search-based local exploitation, Gaussian distribution-based swarm leader enhancement, and mirroring and mutation operations for worst solution improvement. The second proposed PSO model enhances the first through four new strategies, i.e., an adaptive exemplar breeding mechanism incorporating multiple optimal signals, nonlinear function-oriented search coefficients, and exponential and scattering schemes for swarm leader and worst solution enhancement, respectively. In comparison with a set of 15 classical and advanced search methods, the proposed models demonstrate statistical superiority in discriminative feature selection across a total of 13 data sets.
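For orientation, the following is a minimal sketch of the original (baseline) PSO applied to wrapper-based feature selection, i.e., the starting point that the proposed variants set out to improve. It does not implement either proposed variant. The function names, the nearest-centroid fitness, the 0.5 selection threshold, and all parameter values (`w`, `c1`, `c2`, `alpha`) are assumptions made for illustration only, and accuracy is measured on the training data for brevity.

```python
import numpy as np


def fitness(mask, X, y, alpha=0.9):
    """Hypothetical wrapper fitness: nearest-centroid accuracy on the selected
    features, with a small reward for selecting fewer features."""
    if not mask.any():
        return 0.0  # an empty feature subset cannot classify anything
    Xs = X[:, mask]
    mu0 = Xs[y == 0].mean(axis=0)
    mu1 = Xs[y == 1].mean(axis=0)
    pred = (np.linalg.norm(Xs - mu1, axis=1)
            < np.linalg.norm(Xs - mu0, axis=1)).astype(int)
    acc = (pred == y).mean()
    return alpha * acc + (1 - alpha) * (1 - mask.sum() / mask.size)


def pso_feature_selection(X, y, n_particles=20, n_iters=30,
                          w=0.7, c1=1.5, c2=1.5, seed=0):
    """Baseline PSO: continuous positions in [0, 1], thresholded at 0.5
    to decide which features are selected (an assumed encoding)."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    pos = rng.random((n_particles, d))
    vel = np.zeros((n_particles, d))
    pbest = pos.copy()
    pbest_fit = np.array([fitness(p > 0.5, X, y) for p in pos])
    g = pbest_fit.argmax()
    gbest, gbest_fit = pbest[g].copy(), pbest_fit[g]

    for _ in range(n_iters):
        r1 = rng.random((n_particles, d))
        r2 = rng.random((n_particles, d))
        # Standard velocity update: inertia + personal best + global best terms.
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos = np.clip(pos + vel, 0.0, 1.0)
        fit = np.array([fitness(p > 0.5, X, y) for p in pos])
        improved = fit > pbest_fit
        pbest[improved] = pos[improved]
        pbest_fit[improved] = fit[improved]
        g = pbest_fit.argmax()
        if pbest_fit[g] > gbest_fit:
            gbest, gbest_fit = pbest[g].copy(), pbest_fit[g]
    return gbest > 0.5, gbest_fit


# Usage on synthetic data: only feature 0 carries class information.
rng = np.random.default_rng(1)
y = np.repeat([0, 1], 50)
X = rng.normal(size=(100, 10))
X[:, 0] += 3 * y
mask, score = pso_feature_selection(X, y)
print("selected features:", np.flatnonzero(mask), "fitness:", round(float(score), 3))
```

Because every particle is pulled towards a single global best, a swarm of this kind can collapse onto one region of the search space early, which is the premature convergence behaviour that the proposed variants aim to mitigate.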

Highlights

The knowledge discovery processes in real-world applications often involve datasets with large numbers of features [1]
We propose two enhanced Particle Swarm Optimisation (PSO) models to address the identified limitations of the original PSO algorithm and to undertake complex feature selection problems
We propose four new strategies in PSOVA2 to refine the transition between search diversity and swarm convergence, i.e., (1) an adaptive exemplar breeding mechanism incorporating multiple local and global best solutions, (2) search coefficient generation using sine, cosine, and hyperbolic tangent functions, (3) worst solution enhancement using a hybrid re-dispatching scheme, and (4) an exponential exploitation mechanism for swarm leader improvement

Summary

Introduction

The knowledge discovery processes in real-world applications often involve datasets with large numbers of features [1]. Feature selection and dimensionality reduction become critical in overcoming the challenges posed by such high-dimensional data, by eliminating irrelevant and redundant features while identifying the most effective and discriminative ones [3,4]. For datasets with high dimensionalities, it is computationally impractical to conduct an exhaustive search of all possible combinations of the feature subsets [5]. The search landscape becomes extremely complicated, owing to the sophisticated confounding effects of various feature interactions in terms of redundancy and complementarity [6]. Effective and robust search methods are required to thoroughly explore the complex effects of feature interactions while satisfying the constraints of practicality in terms of computational cost to undertake large-scale feature selection tasks.
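To make the impracticality of exhaustive search concrete: a dataset with d features has 2^d - 1 non-empty feature subsets, so the number of candidates doubles with every added feature. The short sketch below simply evaluates that count; the function name is hypothetical and the feature counts are arbitrary examples, not taken from the paper's datasets.

```python
def n_feature_subsets(d: int) -> int:
    """Number of non-empty subsets of d features (each feature is in or out)."""
    return 2 ** d - 1


# Arbitrary example dimensionalities, chosen only to show the growth rate.
for d in (10, 30, 100):
    print(f"{d} features -> {n_feature_subsets(d):,} candidate subsets")
```

Even at one subset evaluation per microsecond, enumerating the subsets of a 100-feature dataset is far beyond any practical budget, which motivates stochastic search methods such as PSO.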
