Abstract

This paper investigates how the solution representation in nature-inspired algorithms impacts the performance of feature selection in classification problems. The four most suitable nature-inspired algorithms for feature selection were considered in the analysis, namely Differential Evolution, Artificial Bee Colony, Particle Swarm Optimization, and the Genetic Algorithm. The binary-coded and real-coded variants of these algorithms were compared for filter-based and wrapper-based feature selection methodologies on datasets commonly used by the research community. Additionally, the algorithms' performance in reducing the feature subset size was compared across the different solution representations. Statistical tests were performed to detect any significant differences in the algorithms' performance.

Highlights

  • The volume of data needed for classification has been increasing daily

  • This paper investigates the importance of solution representations in nature-inspired algorithms when applied to feature selection in classification problems

  • The purpose of the paper is to help upcoming developers apply the findings of the study, i.e., primarily, how to select the appropriate solution representation, which algorithm to select, and which classification method is the most suitable for their needs

Summary

Introduction

The volume of data needed for classification has been increasing daily. These data are notable for both the number of samples and the number of attributes/features within each sample, which poses a prominent problem for any learning algorithm, whether supervised or unsupervised [1]. The goal of feature selection is to select a subset of the most relevant features without incurring loss of information. Feature selection is used in many application areas relevant to intelligent and expert systems, such as data mining [3], machine learning [4], [5], image processing [6], and bioinformatics [7]. It is normally treated as a data preprocessing step performed before training a classification model.
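
To make the distinction between binary-coded and real-coded solution representations concrete, the following is a minimal Python sketch of how each kind of candidate solution might be decoded into a feature subset. It is an illustration only, not the encoding used in the paper: the function names `decode_binary` and `decode_real`, the example vectors, and the 0.5 selection threshold are all assumptions made for this example.

```python
from typing import List, Sequence


def decode_binary(solution: Sequence[int]) -> List[int]:
    """Return the indices of selected features from a binary-coded solution.

    Each gene is 0 or 1; a 1 means the feature at that index is kept.
    """
    return [i for i, bit in enumerate(solution) if bit == 1]


def decode_real(solution: Sequence[float], threshold: float = 0.5) -> List[int]:
    """Return the indices of selected features from a real-coded solution.

    Assumption for this sketch: each gene lies in [0, 1] and a feature is
    kept when its gene is strictly greater than `threshold`.
    """
    return [i for i, gene in enumerate(solution) if gene > threshold]


# Hypothetical candidate solutions for a dataset with five features.
binary_solution = [1, 0, 0, 1, 1]
real_solution = [0.91, 0.12, 0.47, 0.66, 0.50]

selected_from_binary = decode_binary(binary_solution)
selected_from_real = decode_real(real_solution)

print("binary-coded selects features:", selected_from_binary)
print("real-coded selects features:", selected_from_real)
```

In both cases the optimizer searches over vectors whose length equals the number of features; the difference lies in whether the search operators act on bits directly or on continuous values that are mapped to a subset only at evaluation time.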
