Abstract

Reducing the number of selected features and improving the classification performance are the two major objectives in feature selection, which can therefore be viewed as a multi-objective optimization problem. Multi-objective feature selection in classification has unique characteristics; for example, it strongly prefers classification performance over the number of selected features. Moreover, duplicate solutions often appear in both the search space and the objective space, which degrades population diversity and leads to premature convergence. To address these issues, this paper reformulates multi-objective feature selection as a constrained multi-objective optimization problem during the evolutionary training process: a constraint on classification performance is imposed on each solution (i.e., feature subset) according to the distribution of nondominated solutions, with the aim of selecting promising feature subsets that contain more informative and strongly relevant features and thus improve classification performance. Furthermore, based on the distribution of feature subsets in the objective space and their similarity in the search space, a duplication analysis and handling method is proposed to enhance population diversity. Experimental results on 18 classification datasets demonstrate that the proposed method outperforms six state-of-the-art algorithms and is computationally efficient.
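The abstract does not give implementation details, but the duplication-handling idea it mentions, detecting solutions that coincide in the search space (identical feature subsets) or in the objective space (identical objective values), can be illustrated with a minimal sketch. All names and data structures here are assumptions for illustration, not the authors' actual method.

```python
# Hypothetical sketch of duplication analysis for a population of
# feature subsets, loosely based on the idea described in the abstract.
# A solution is a binary tuple (1 = feature selected); its objectives
# are (classification_error, number_of_selected_features).

def deduplicate(population, objectives):
    """Return indices of solutions to keep, dropping duplicates.

    A solution is a duplicate if its feature mask was already seen
    (search-space duplicate) or its objective vector was already seen
    (objective-space duplicate). Dropped slots would, in a real
    algorithm, be refilled (e.g., by mutating the kept solutions).
    """
    seen_masks, seen_objs, keep = set(), set(), []
    for i, (mask, obj) in enumerate(zip(population, objectives)):
        if mask in seen_masks or obj in seen_objs:
            continue  # duplicate in search or objective space
        seen_masks.add(mask)
        seen_objs.add(obj)
        keep.append(i)
    return keep


population = [(1, 0, 1), (1, 0, 1), (0, 1, 1), (1, 1, 0)]
objectives = [(0.10, 2), (0.10, 2), (0.10, 2), (0.20, 2)]
print(deduplicate(population, objectives))  # -> [0, 3]
```

Here solution 1 duplicates solution 0 in the search space, and solution 2 duplicates it in the objective space, so only indices 0 and 3 survive; the paper's actual method additionally uses the distribution of nondominated solutions, which this sketch omits.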

