Feature selection (FS), one of the most important preprocessing techniques in machine learning and pattern recognition, has received considerable attention. In recent years, evolutionary computation has become a popular approach to FS problems because of its strong global search capability. This paper presents a comprehensive review of evolutionary computation research on FS problems. First, a new taxonomy of the basic components of evolutionary feature selection algorithms (EFSs) is proposed, covering encoding strategy, population initialization, population updating, local search, multi-FS hybrid and ensemble. Second, we summarize the latest research progress of EFSs in new and complex scenarios, including large-scale high-dimensional data, multi-objective/metric scenarios, multi-label data, distributed storage data, multi-task scenarios, multi-modal scenarios, interpretable FS and stable FS. Moreover, this survey also provides an in-depth analysis of real-world applications of EFSs, such as hyperspectral band selection, gene selection in bioinformatics, text classification and fault detection. Finally, several opportunities for future work are pointed out.