Abstract

Feature selection is an important task in big data analysis and information retrieval. It reduces the number of features by removing noisy and extraneous data. In this paper, a feature subset selection algorithm based on damping oscillation theory and a support vector machine classifier is proposed. This algorithm is called the Maximum Kendall coefficient Maximum Euclidean Distance Improved Gray Wolf Optimization algorithm (MKMDIGWO). In MKMDIGWO, first, a filter model based on the Kendall coefficient and Euclidean distance is proposed to measure the correlation and redundancy of the candidate feature subset. Second, the wrapper model is an improved gray wolf optimization algorithm whose position update formula has been modified to achieve better results. Third, the filter model and the wrapper model are dynamically adjusted by damping oscillation theory in order to find an optimal feature subset. MKMDIGWO therefore combines the efficiency of the filter model with the high precision of the wrapper model. Experimental results on five UCI public data sets and two microarray data sets demonstrate that MKMDIGWO achieves higher classification accuracy than four other state-of-the-art algorithms. The maximum ACC value of the MKMDIGWO algorithm is at least 0.5% higher than that of the other algorithms on 10 data sets.
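As a rough illustration of the two quantities the filter model is built on, the following Python sketch computes a Kendall-based relevance term and a Euclidean-distance term for a candidate subset. It is only a sketch under assumptions: the function names (`kendall_tau`, `euclidean`, `filter_terms`), the use of tau-a, and the averaging are all hypothetical choices, and the abstract does not state how the paper actually combines the two terms.

```python
import math
from itertools import combinations

def kendall_tau(x, y):
    """Kendall tau-a: (concordant - discordant) / total number of pairs."""
    n = len(x)
    s = 0
    for i, j in combinations(range(n), 2):
        prod = (x[i] - x[j]) * (y[i] - y[j])
        if prod > 0:
            s += 1      # concordant pair
        elif prod < 0:
            s -= 1      # discordant pair (ties contribute 0)
    return s / (n * (n - 1) / 2)

def euclidean(u, v):
    """Euclidean distance between two equally long feature columns."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def filter_terms(columns, labels, subset):
    """Hypothetical filter terms for a candidate subset (an assumption,
    not the paper's formula):
      relevance = mean |tau| between each selected feature and the labels
      spread    = mean pairwise Euclidean distance between selected features
    A larger spread is read here as lower redundancy."""
    relevance = sum(abs(kendall_tau(columns[f], labels)) for f in subset) / len(subset)
    pairs = list(combinations(subset, 2))
    spread = (sum(euclidean(columns[a], columns[b]) for a, b in pairs) / len(pairs)
              if pairs else 0.0)
    return relevance, spread

# Toy data: feature 1 duplicates feature 0, feature 2 is a different ordering.
columns = {0: [1, 2, 3, 4], 1: [1, 2, 3, 4], 2: [4, 1, 3, 2]}
labels = [0, 0, 1, 1]
redundant = filter_terms(columns, labels, [0, 1])
diverse = filter_terms(columns, labels, [0, 2])
```

In this toy example the duplicated pair has a spread of zero, while the pair of distinct features has a positive spread, which is the behaviour a "maximum distance" criterion would reward.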

Highlights

  • In recent years, much irrelevant and redundant data has been found during data collection and sorting

  • This paper proposes a new feature selection algorithm named MKMDIGWO, which has two characteristics

  • The second is the use of a damping oscillation function to adjust the combination of the filter algorithm and the wrapper algorithm

Summary

Introduction

Much irrelevant and redundant data has been found during data collection and sorting. To eliminate the uniqueness of candidate feature subsets in the two-stage method, and to prevent the embedded method from falling into a local optimum, we employ damping oscillation theory and combine the filter method and the wrapper method to propose a novel algorithm. This algorithm is called the Maximum Kendall coefficient Maximum Euclidean Distance Improved Gray Wolf Optimization algorithm (MKMDIGWO). By comparing the maximum number of retention times with multiples of the damping oscillation value, the filter method and the wrapper method alternately perform depth and breadth searches on the data set to find the optimal feature subset. In 2016, Emary et al. proposed a binary gray wolf optimization algorithm, built on the continuous version, to solve the feature selection problem [34].
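The switching idea can be pictured with a small Python sketch. It assumes the textbook damped oscillation x(t) = A·exp(−λt)·cos(ωt) and a purely hypothetical rule that compares a retention count (how many iterations the current best subset has survived) with a scaled multiple of |x(t)|. The parameter values, the `scale` factor, the function names, and the direction of the comparison are illustrative assumptions, not the rule given in the paper.

```python
import math

def damped_oscillation(t, amplitude=1.0, decay=0.1, omega=1.0):
    """Textbook damped oscillation: A * exp(-decay * t) * cos(omega * t)."""
    return amplitude * math.exp(-decay * t) * math.cos(omega * t)

def choose_stage(retention_count, t, scale=10.0):
    """Hypothetical switching rule (an assumption for illustration):
    if the best subset has been retained longer than the scaled damped
    threshold, hand control to the filter stage; otherwise keep running
    the wrapper stage."""
    threshold = scale * abs(damped_oscillation(t))
    return "filter" if retention_count > threshold else "wrapper"

# The threshold oscillates and shrinks over iterations, so the same
# retention count can trigger different stages at different times.
schedule = [choose_stage(retention_count=3, t=t) for t in range(0, 30, 5)]
```

Because the envelope exp(−λt) decays, the threshold shrinks as iterations proceed, so under this assumed rule the stages alternate early on and the switching condition becomes easier to meet later.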

Input Parameter: the number of features
Experimental results and analysis
Conclusion