Abstract

We live in a period when smart devices gather large amounts of data from a variety of sensors, and decisions are often taken on the basis of these data in a more or less autonomous manner. Still, many of the inputs do not prove essential to the decision-making process; hence, it is of utmost importance to find means of eliminating the noise and concentrating on the most influential attributes. To this end, we put forward a method based on the swarm intelligence paradigm for extracting the most important features from several datasets. The subject of this paper is a novel implementation of an algorithm from the swarm intelligence branch of the machine learning domain for improving feature selection. The combination of machine learning with metaheuristic approaches has recently created a new branch of artificial intelligence called learnheuristics. This approach benefits both from the capability of feature selection to find the solutions that most affect accuracy and performance, and from the well-known ability of swarm intelligence algorithms to efficiently comb through a large search space of solutions. The latter is used as a wrapper method in feature selection, and the improvements are significant. In this paper, a modified version of the salp swarm algorithm for feature selection is proposed. The solution is verified on 21 datasets using a K-nearest neighbors classifier. Furthermore, the performance of the algorithm is compared with that of the best algorithms under the same test setup, and the proposed solution achieves better results in both the number of selected features and classification accuracy. The proposed method therefore tackles feature selection and demonstrates its success on many benchmark datasets.

Highlights

  • The fields of big data, cryptography, and computer science in general are all influenced by the domain of optimization and, to some extent, rely on it

  • Guided by established practice from the modern literature, before its application to feature selection, the proposed enhanced salp swarm algorithm (SSA) is first tested and evaluated on a recognized test-bed of challenging 30-dimensional function instances from the Congress on Evolutionary Computation 2013 (CEC2013) benchmark suite [25]. This allows a direct comparison of the obtained results with the outputs of a large variety of state-of-the-art (SOTA) metaheuristics. Afterwards, it is adapted as a wrapper-based approach for feature selection and validated against 21 well-known datasets retrieved from the University of California, Irvine (UCI) repository [26]

  • The real-world applications of swarm intelligence solutions are broad, ranging from clustering, node localization, and energy preservation in wireless sensor networks [48,49,50,51], through the scheduling of cloud tasks [2,52], the prediction of COVID-19 cases based on machine learning [53,54], MRI classification optimization [55,56], and text document clustering [57], to the optimization of artificial neural networks [58,59,60,61]

Summary

Introduction

The fields of big data, cryptography, and computer science in general are all influenced by the domain of optimization and, to some extent, rely on it. Guided by established practice from the modern literature, before its application to feature selection, the proposed enhanced salp swarm algorithm (SSA) is first tested and evaluated on a recognized test-bed of challenging 30-dimensional function instances from the Congress on Evolutionary Computation 2013 (CEC2013) benchmark suite [25]. This allows a direct comparison of the obtained results with the outputs of a large variety of state-of-the-art (SOTA) metaheuristics. The proposed improved SSA overcomes some observed deficiencies of the original SSA and performs better than it; the proposed method proves promising and competitive with other SOTA metaheuristics according to the CEC2013 testing results; and, compared with other SOTA approaches, it improves feature selection in machine learning in terms of both classification accuracy and the number of selected features.
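To make the wrapper idea concrete, the following minimal Python sketch shows how a candidate feature subset could be scored with a K-nearest neighbors classifier. It is an illustration under stated assumptions, not the authors' implementation: the function names (`knn_accuracy`, `fitness`, `to_mask`, `make_toy_data`), the weighting `alpha = 0.99`, the 0.5 binarization threshold, `k = 5`, leave-one-out evaluation, and the toy data are all hypothetical choices that do not come from the text above.

```python
import random


def knn_accuracy(X, y, mask, k=5):
    """Leave-one-out k-NN accuracy using only features where mask[j] == 1."""
    cols = [j for j, bit in enumerate(mask) if bit]
    if not cols:
        return 0.0  # an empty subset cannot classify anything
    correct = 0
    for i, xi in enumerate(X):
        # Squared Euclidean distance to every other sample on the selected columns.
        dists = sorted(
            (sum((xi[j] - xo[j]) ** 2 for j in cols), y[o])
            for o, xo in enumerate(X)
            if o != i
        )
        votes = {}
        for _, label in dists[:k]:
            votes[label] = votes.get(label, 0) + 1
        predicted = max(votes, key=votes.get)
        correct += predicted == y[i]
    return correct / len(X)


def fitness(X, y, mask, alpha=0.99, k=5):
    """Lower is better: weighted sum of error rate and fraction of features kept.

    The weighting scheme and alpha value are assumptions for illustration.
    """
    error = 1.0 - knn_accuracy(X, y, mask, k)
    ratio = sum(mask) / len(mask)
    return alpha * error + (1 - alpha) * ratio


def to_mask(position, threshold=0.5):
    """Binarize a continuous swarm position into a feature mask (assumed rule)."""
    return [1 if p > threshold else 0 for p in position]


def make_toy_data(seed=0, per_class=20):
    """Toy data: feature 0 separates the classes, features 1 and 2 are noise."""
    rng = random.Random(seed)
    X, y = [], []
    for label, center in ((0, 0.0), (1, 10.0)):
        for _ in range(per_class):
            X.append([center + rng.random(),
                      rng.uniform(-50, 50),
                      rng.uniform(-50, 50)])
            y.append(label)
    return X, y


if __name__ == "__main__":
    X, y = make_toy_data()
    for position in ([0.9, 0.1, 0.2], [0.9, 0.8, 0.7]):
        mask = to_mask(position)
        print(mask, round(fitness(X, y, mask), 4))
```

In a wrapper setup of this kind, a metaheuristic such as SSA would propose continuous positions, each position would be binarized into a mask, and `fitness` would serve as the objective to minimize, so that subsets with fewer features are preferred when accuracy is equal.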

Related Works
Proposed Method
Basic Salp Swarm Algorithm
Cons of the Original Algorithm and Proposed Improved Approach
Complexity and Limitations of Proposed Method
Validation of the Proposed Method for Standard CEC2013 Benchmarks
Objective
Feature Selection Experiments
Conclusions