Abstract

Feature selection, an essential step in many real-world problems, aims to reduce the number of features while improving classification accuracy. Different feature subsets can achieve similar or even identical objective values (e.g., classification accuracy and number of selected features), so the optimal feature subset of a classification problem may not be unique. However, most existing feature selection methods do not attempt to find multiple optimal feature subsets. In this article, a multiobjective differential evolution approach is developed to search for multiple optimal feature subsets. The contributions are threefold. First, to provide a good starting point, an initialization method that considers feature relevance is proposed. Second, a clustering method divides the whole population into multiple subpopulations. In each subpopulation, a subarchive uses a newly developed crowding distance to ensure diversity in both the search space and the objective space. Finally, the nondominated solutions from all the subarchives are retained in another archive which, together with an improved hypervolume contribution indicator, guides the evolutionary feature selection process. Experiments on 14 datasets of varying difficulty show that the proposed approach evolves a better Pareto front of feature subsets than seven other state-of-the-art methods and finds different feature subsets with similar or the same classification performance.
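The non-uniqueness of optimal feature subsets can be illustrated with a minimal Python sketch of Pareto dominance over two minimized objectives, classification error and number of selected features. This is not the article's algorithm: the function names, the tuple layout, and the toy values are assumptions made purely for illustration, and the sketch covers only the generic nondominated filter, not the clustering, crowding distance, or hypervolume components.

```python
from typing import List, Tuple

# Assumed layout: (feature mask, classification error, number of selected features).
Solution = Tuple[Tuple[int, ...], float, int]


def dominates(a: Solution, b: Solution) -> bool:
    """Return True if a is no worse than b on both objectives and strictly better on one."""
    no_worse = a[1] <= b[1] and a[2] <= b[2]
    strictly_better = a[1] < b[1] or a[2] < b[2]
    return no_worse and strictly_better


def nondominated(solutions: List[Solution]) -> List[Solution]:
    """Keep every solution that no other solution dominates."""
    return [s for s in solutions if not any(dominates(o, s) for o in solutions)]


# Hypothetical toy values, not taken from the article's experiments.
candidates: List[Solution] = [
    ((1, 1, 0, 0), 0.10, 2),  # same objectives as the next subset,
    ((0, 1, 1, 0), 0.10, 2),  # but a different set of features
    ((1, 1, 1, 0), 0.10, 3),  # dominated: same error, more features
    ((1, 0, 0, 0), 0.25, 1),
    ((1, 1, 1, 1), 0.08, 4),
]

front = nondominated(candidates)
for mask, error, n_features in front:
    print(mask, error, n_features)
```

In this toy example the first two subsets select different features yet share the same objective values, so neither dominates the other and both stay on the front. A method that compares solutions only in the objective space would treat them as interchangeable and could discard one, which is why diversity in the search space also matters.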
