Abstract

Feature selection (FS) is an important data preprocessing technique for machine learning and data mining. Metaheuristic (MH) algorithms have been widely used in feature selection because of their strong search capability. This paper presents an improved Binary Dandelion Algorithm using a Sine Cosine operator and a Restart strategy (SCRBDA) for feature selection. First, the sine cosine operator is used in the radius formula of the core dandelions (CD), which significantly enhances the algorithm's exploitation and exploration abilities. Second, the algorithm uses a restart strategy to improve its ability to escape local optima. Third, mutual information is used to guide the generation of some dandelions, so that the search pays more attention to the correlation between the selected features and the class labels. Finally, quick bit mutation is used as the mutation strategy to improve the diversity of the population. SCRBDA was tested on 18 datasets of different sizes from the UCI Machine Learning Repository. It was compared with 8 other classical feature selection algorithms, and its performance was evaluated in terms of feature subset size, classification accuracy, fitness value, and F1-score. The experimental results show that SCRBDA has stronger feature reduction ability and achieves the best overall performance on most datasets. On large-scale datasets in particular, SCRBDA obtains markedly smaller feature subsets while maintaining higher classification accuracy and a satisfactory F1-score.
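
The abstract does not give the formulas behind these components, so the following is only a minimal sketch of two of the ideas it names: mutual information guiding which features a candidate solution selects, and a fitness value that trades classification error against subset size. The function names, the `floor` parameter, the weight `alpha = 0.99`, and the weighted-sum fitness itself are illustrative assumptions (a form common in wrapper-based feature selection), not the paper's actual definitions.

```python
import numpy as np

def mutual_information(x, y):
    """Mutual information (in nats) between two discrete 1-D arrays."""
    x_vals, x_idx = np.unique(x, return_inverse=True)
    y_vals, y_idx = np.unique(y, return_inverse=True)
    # Build the joint probability table by counting co-occurrences.
    joint = np.zeros((x_vals.size, y_vals.size))
    np.add.at(joint, (x_idx, y_idx), 1)
    joint /= joint.sum()
    px = joint.sum(axis=1, keepdims=True)   # marginal of x, shape (nx, 1)
    py = joint.sum(axis=0, keepdims=True)   # marginal of y, shape (1, ny)
    nz = joint > 0                          # skip empty cells (0 * log 0 = 0)
    return float(np.sum(joint[nz] * np.log(joint[nz] / (px @ py)[nz])))

def mi_guided_mask(X, y, rng, floor=0.1):
    """Sample a binary feature mask biased toward label-relevant features.

    Assumption: selection probability grows linearly with each feature's
    mutual information with the labels; `floor` keeps every feature reachable.
    """
    mi = np.array([mutual_information(X[:, j], y) for j in range(X.shape[1])])
    if mi.max() > 0:
        prob = floor + (1.0 - floor) * mi / mi.max()
    else:
        prob = np.full(mi.shape, 0.5)       # no signal: fall back to a coin flip
    mask = (rng.random(X.shape[1]) < prob).astype(int)
    return mask, mi

def fitness(error_rate, mask, alpha=0.99):
    """Assumed weighted-sum fitness (lower is better): error vs. subset size."""
    return alpha * error_rate + (1.0 - alpha) * mask.sum() / mask.size

# Tiny illustrative dataset: feature 0 copies the label, feature 1 is constant.
X_demo = np.array([[0, 7], [1, 7], [0, 7], [1, 7]])
y_demo = np.array([0, 1, 0, 1])
demo_mask, demo_mi = mi_guided_mask(X_demo, y_demo, np.random.default_rng(0))
```

In a full wrapper method, `error_rate` would come from a classifier evaluated on the features where `mask` is 1; that step is omitted here because the abstract does not name the classifier.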
