Abstract

Feature selection is widely used in classification to improve classification accuracy and reduce computational complexity. Recently, evolutionary computation (EC) has become an important approach for solving feature selection problems. However, two issues remain. First, as the datasets processed by classifiers become larger and more complex, they may contain more irrelevant and redundant features, and the large-scale feature space may contain more local optima. Traditional EC algorithms, which have only one candidate solution generation strategy (CSGS) with fixed parameter values, may therefore not perform well in searching for the optimal feature subsets in large-scale feature selection problems. Second, many existing studies use only one classifier to evaluate feature subsets; to show the effectiveness of evolutionary algorithms for feature selection, more classifiers should be tested. Thus, to solve large-scale feature selection problems efficiently and to show whether EC-based feature selection is effective for more classifiers, this paper proposes a self-adaptive parameter and strategy based particle swarm optimization (SPS-PSO) algorithm and evaluates it using multiple classifiers. SPS-PSO uses a representation scheme of solutions and five CSGSs. To automatically adjust the CSGSs and their parameter values during the evolutionary process, a strategy self-adaptive mechanism and a parameter self-adaptive mechanism are employed within the framework of particle swarm optimization (PSO). With these self-adaptive mechanisms, SPS-PSO can adjust both the CSGSs and their parameter values when solving different large-scale feature selection problems, which gives it good global and local search ability on these problems.
Moreover, four classifiers, i.e., k-nearest neighbor (KNN), linear discriminant analysis (LDA), extreme learning machine (ELM), and support vector machine (SVM), are individually used as evaluation functions to test the effectiveness of the feature subsets generated by SPS-PSO. Nine datasets from the UCI Machine Learning Repository and Causality Workbench are used in the experiments. All nine datasets have more than 600 dimensions, and two of them have more than 5,000 dimensions. The experimental results show that the strategy and parameter self-adaptive mechanisms can improve the performance of the evolutionary algorithms, and that SPS-PSO achieves higher classification accuracy and obtains more concise solutions than the other algorithms on the large-scale feature selection problems selected in this research. In addition, feature selection can improve the classification accuracy and reduce the computational time for various classifiers. Furthermore, KNN is a better surrogate model than the other classifiers used in these experiments.
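
The abstract does not specify how the strategy self-adaptive mechanism or the solution representation work, so the following is only a generic sketch of the idea, not the SPS-PSO procedure itself. It assumes a threshold-based decoding of a continuous particle position into a feature mask and a success-rate-based roulette wheel over five strategies; the function names, the 0.5 threshold, and the probability floor are all hypothetical choices made for illustration.

```python
import random


def decode(position, threshold=0.5):
    """Map a continuous position to a feature mask.

    Assumption: a feature is selected when its coordinate exceeds the
    threshold. The paper's actual representation scheme may differ.
    """
    return [x > threshold for x in position]


def update_probabilities(successes, attempts, floor=0.05):
    """Convert per-strategy success counts into selection probabilities.

    Assumption: a strategy "succeeds" when the candidate it generated
    improved a particle. The floor keeps unused or unsuccessful
    strategies selectable; it is an illustrative choice.
    """
    rates = [s / a if a else 0.0 for s, a in zip(successes, attempts)]
    scores = [r + floor for r in rates]
    total = sum(scores)
    return [s / total for s in scores]


def pick_strategy(probabilities, rng):
    """Roulette-wheel selection of a strategy index."""
    indices = range(len(probabilities))
    return rng.choices(indices, weights=probabilities, k=1)[0]


# Usage: five hypothetical strategies, each tried four times so far.
rng = random.Random(0)
probs = update_probabilities(successes=[3, 0, 1, 0, 0], attempts=[4, 4, 4, 4, 4])
chosen = pick_strategy(probs, rng)
mask = decode([0.2, 0.7, 0.5])
```

In a wrapper setting such as the one described above, the decoded mask would select the dataset columns passed to a classifier such as KNN, and the resulting accuracy would serve as the particle's fitness.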
