Abstract

Feature selection is an important task in data mining and machine learning for reducing the dimensionality of data and improving performance. However, feature selection remains challenging, especially for large-scale problems with a small sample size and an extremely large number of features. A variety of methods have been applied to feature selection problems, among which evolutionary algorithms have recently attracted increasing attention and made great progress. In this study, a two-stage decomposition cooperative coevolution strategy for feature selection (CCFS/TD) is proposed. In the first stage, the proposed algorithm decomposes the evolutionary process into k levels and evolves each level by cooperative coevolution. In the second stage, the evolutionary process of each level is further decomposed into several independent processes. The selected subset of features is determined from the results of all independent processes through majority voting. Experiments on ten benchmark datasets are carried out to verify the effectiveness of the proposed method. The results demonstrate that the proposed CCFS/TD obtains better classification performance with a smaller number of features in most cases compared with existing methods.
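To make the second-stage aggregation step concrete, here is a minimal sketch (not the authors' implementation) of how majority voting might combine the binary feature masks returned by the independent evolutionary processes; the function name and mask encoding are assumptions for illustration.

```python
import numpy as np

def majority_vote(masks):
    """Combine binary feature masks (one per independent process) by majority vote.

    A feature is kept only if more than half of the processes selected it.
    """
    masks = np.asarray(masks)          # shape: (n_processes, n_features)
    votes = masks.sum(axis=0)          # number of processes selecting each feature
    return votes > masks.shape[0] / 2  # strict majority

# Example: three independent processes voting over five features
masks = [
    [1, 0, 1, 0, 1],
    [1, 1, 0, 0, 1],
    [0, 0, 1, 0, 1],
]
selected = majority_vote(masks)
print(selected.astype(int))  # → [1 0 1 0 1]
```

Features 0, 2, and 4 receive two or three votes out of three, so only they survive the vote.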

Highlights

  • Feature selection (FS) plays a fundamental role in data mining and machine learning

  • The first-stage decomposition strategy effectively reduces the feature dimension

  • A new feature selection method based on a two-stage decomposition cooperative coevolution strategy is proposed


Introduction

Feature selection (FS) plays a fundamental role in data mining and machine learning. Without prior knowledge, it is difficult to determine which features are useful, so a large number of features must be taken into consideration, among which there may be noisy and useless ones. Irrelevant and redundant features are useless for classification and may reduce classification performance. This is due to ‘‘the curse of dimensionality’’ caused by a large number of features, which remains a major challenge in classification [1]. Feature selection (see Fig. 1) aims to select a subset of relevant features from the whole set. By deleting irrelevant and redundant features, FS can reduce the number of features, speed up the learning process, simplify the learned model, and improve classification performance [2].
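As a simple illustration of removing redundant features (a filter-style sketch, not the CCFS/TD method described in this paper), the following greedily keeps a feature only if it is not highly correlated with any feature already kept; the function name and threshold are assumptions for illustration.

```python
import numpy as np

def drop_redundant(X, threshold=0.95):
    """Greedily keep feature columns whose absolute correlation with every
    already-kept column is below the threshold; return kept column indices."""
    corr = np.abs(np.corrcoef(X, rowvar=False))
    kept = []
    for j in range(X.shape[1]):
        if all(corr[j, k] < threshold for k in kept):
            kept.append(j)
    return kept

# Example: column 1 is a near-duplicate of column 0 and gets dropped
rng = np.random.default_rng(0)
a = rng.normal(size=100)
b = rng.normal(size=100)
X = np.column_stack([a, a + 1e-6 * rng.normal(size=100), b])
print(drop_redundant(X))  # → [0, 2]
```

Such filters are cheap but consider features only pairwise; the evolutionary search described in this paper instead evaluates whole subsets against classification performance.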

