Abstract

Feature selection is an essential preprocessing step in pattern recognition and data mining. It can be cast as an optimization problem and solved with nature-inspired algorithms. In this paper, we propose an efficient feature selection method based on the cuckoo search algorithm, called CBCSEM. The proposed method avoids the premature convergence of traditional methods and their tendency to fall into local optima, and its efficiency is attributed to three aspects. First, a chaotic map increases the diversity of the initial population and lays the foundation for convergence. Second, the proposed two-population elite preservation strategy finds the most attractive individual of each generation and preserves it. Finally, Lévy flight is used to update the position of each cuckoo, and the proposed uniform mutation strategy keeps Lévy flight from making the search space too large for the algorithm to converge, which improves the algorithm's exploitation ability. Experimental results on several real UCI datasets show that the proposed method is competitive with other feature selection algorithms.
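The position update described in the abstract can be illustrated with a minimal sketch. The abstract does not give the update equations, so everything here is an assumption: the Lévy step is drawn with Mantegna's algorithm (a common choice in cuckoo search implementations), positions are taken to lie in [0, 1], and the names `levy_step`, `levy_update`, `uniform_mutation`, `alpha`, `beta`, and `rate` are hypothetical.

```python
import math
import random


def levy_step(beta=1.5, rng=random):
    """Draw one Lévy-distributed step with Mantegna's algorithm.

    Assumption: the source does not say how Lévy steps are generated.
    """
    sigma_u = (
        math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
        / (math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))
    ) ** (1 / beta)
    u = rng.gauss(0.0, sigma_u)
    v = rng.gauss(0.0, 1.0)
    while v == 0.0:  # guard against division by zero
        v = rng.gauss(0.0, 1.0)
    return u / abs(v) ** (1 / beta)


def levy_update(position, best, alpha=0.01, beta=1.5, rng=random):
    """Move each coordinate by a Lévy step scaled by its distance to the
    best nest, then clamp the result to [0, 1] (assumed search range)."""
    new_position = []
    for x, b in zip(position, best):
        x_new = x + alpha * levy_step(beta, rng) * (x - b)
        new_position.append(min(1.0, max(0.0, x_new)))
    return new_position


def uniform_mutation(position, rate=0.1, rng=random):
    """Resample each coordinate uniformly in [0, 1) with probability `rate`."""
    return [rng.random() if rng.random() < rate else x for x in position]
```

In this sketch the clamp and the mutation both keep candidates inside a bounded range, which is one plausible reading of how a uniform mutation could counteract the occasional very long jumps of a Lévy flight; the paper's actual strategy may differ.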

Highlights

  • Data processing and data mining have become a significant area of research for academics, and how to process data is a very complex and challenging task

  • Information processing methods such as information gain, information entropy, Pareto analysis, t-tests, and mutual information have been used to solve feature selection problems [3,4,5], where the principle is to use the correlation between features and attributes to select the subset of features with the strongest correlation. The wrapper approach consists of two phases: the feature selection phase and the training phase of the machine learning classifier. The selection of feature subsets depends on the classification algorithm, and the selected feature subsets give the classifier a higher accuracy

  • To address the shortcomings of the abovementioned algorithms, we propose in this paper a new multistrategy integration cuckoo search algorithm that improves the performance of the cuckoo algorithm on the feature selection problem. The main contributions are as follows: (1) A new feature selection method is proposed based on the cuckoo search algorithm (CBCSEM). The population is initialized using different chaotic maps to ensure the diversity of the population, and uniformly initialized individuals form the basis for the convergence of the algorithm
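As a rough illustration of chaotic initialization, the sketch below fills a population with successive iterates of the logistic map and thresholds each value to decide whether a feature is selected. The text says only that "different chaotic maps" are used, so the choice of the logistic map, the parameters `x0`, `r`, and `threshold`, and the function names are all assumptions made for this example.

```python
def logistic_map_population(n_nests, n_features, x0=0.7, r=4.0):
    """Build a population whose entries are successive logistic-map iterates.

    Assumption: the logistic map x <- r * x * (1 - x) with r = 4 stands in
    for the unspecified chaotic maps; values stay within [0, 1].
    """
    x = x0
    population = []
    for _ in range(n_nests):
        nest = []
        for _ in range(n_features):
            x = r * x * (1.0 - x)  # one chaotic iteration
            nest.append(x)
        population.append(nest)
    return population


def to_feature_mask(nest, threshold=0.5):
    """Select feature i when its continuous value exceeds the threshold
    (a hypothetical binarization rule, not taken from the source)."""
    return [value > threshold for value in nest]


population = logistic_map_population(n_nests=4, n_features=6)
masks = [to_feature_mask(nest) for nest in population]
```

Each mask could then be scored with a wrapper-style fitness, that is, by training a classifier on the selected columns only and measuring its accuracy.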


Summary

Introduction

Data processing and data mining have become a significant area of research for academics, and how to process data is a very complex and challenging task. The feature selection problem is a very challenging task in the field of machine learning. In machine learning classification tasks, it is necessary to select a subset of features that makes the classifier both accurate and efficient. Information processing methods such as information gain, information entropy, Pareto analysis, t-tests, and mutual information have been used to solve feature selection problems [3,4,5], where the principle is to use the correlation between features and attributes to select the subset of features with the strongest correlation. The wrapper approach consists of two phases: the feature selection phase and the training phase of the machine learning classifier. The embedded approach tries to combine the abovementioned

