Abstract

Feature selection and feature construction are important pre-processing techniques in data mining. They can reduce dimensionality and also improve classifier accuracy and efficiency, which matters especially for high-dimensional data. Feature construction for high-dimensional data remains very challenging because the search space of feature combinations grows with the number of features. Recently, researchers have used Genetic Programming (GP) for feature construction, with promising results. Unfortunately, wrapper evaluation of each feature subset, in which a feature may itself be constructed from a combination of features, is computationally intensive because it requires running the classifier on the data sets. Motivated by this observation, we propose in this paper a hybrid multiobjective evolutionary approach for efficient feature construction and selection. Our approach uses two filter objectives and one wrapper objective, the latter corresponding to classification accuracy. The whole population is evaluated using only the two filter objectives, while only the non-dominated (best) feature subsets are improved by an indicator-based local search that optimizes all three objectives simultaneously. Our approach has been assessed on six high-dimensional datasets and compared with two prominent existing GP approaches, using three different classifiers for accuracy evaluation. The obtained results show that our approach is competitive with, or better than, the two GP algorithms tested in this study.
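
As an illustration of the selection step described above, the following minimal Python sketch shows how non-dominated feature subsets could be picked out once every candidate has been scored on two filter objectives. It is a sketch under stated assumptions, not the paper's implementation: the abstract does not name the filter measures, so the scores below are made-up values, both objectives are assumed to be minimized, and the names `dominates`, `non_dominated` and `filter_scores` are hypothetical.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if a Pareto-dominates b (all objectives minimized):
    a is no worse than b everywhere and strictly better somewhere."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def non_dominated(scores: List[Sequence[float]]) -> List[int]:
    """Return the indices of candidates that no other candidate dominates."""
    return [
        i
        for i, s in enumerate(scores)
        if not any(dominates(t, s) for j, t in enumerate(scores) if j != i)
    ]


# Hypothetical filter scores for five candidate feature subsets
# (assumption: two filter objectives, both to be minimized).
filter_scores = [(0.2, 0.9), (0.4, 0.4), (0.5, 0.5), (0.9, 0.1), (0.3, 0.95)]

# Only these candidates would be passed on to the costlier
# three-objective local search that includes the wrapper accuracy.
front = non_dominated(filter_scores)
print(front)
```

Restricting the classifier-based (wrapper) evaluation to this front is what keeps the cost down: the cheap filter scores are computed for the whole population, and the classifier is run only for the few candidates that survive the dominance check.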
