Abstract
Feature selection is a crucial method for discovering relevant features in high‐dimensional data. However, most studies focus on completely labeled data and ignore the missing labels that frequently occur in real‐world problems. To address the high‐dimensional and label‐missing problems in data classification simultaneously, we propose a semisupervised bacterial heuristic feature selection algorithm. To tackle the label‐missing problem, a k‐nearest neighbor semisupervised learning strategy is designed to reconstruct missing labels. In addition, the bacterial heuristic algorithm is improved with hierarchical population initialization, dynamic learning, and elite population evolution strategies to enhance its capacity to search over feature combinations. To verify the effectiveness of the proposed algorithm, three groups of comparison experiments are conducted on eight datasets, against two traditional feature selection methods, four bacterial heuristic feature selection algorithms, and two swarm‐based heuristic feature selection algorithms. Experimental results demonstrate that the proposed algorithm has clear advantages in terms of classification accuracy and the number of selected features.
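To make the idea of reconstructing missing labels with k‐nearest neighbors concrete, the following is a minimal sketch, not the algorithm proposed in the paper. It assumes a plain majority vote among the k nearest labeled samples under Euclidean distance; the function name `knn_fill_labels`, the use of `None` to mark a missing label, and the toy data are all hypothetical choices made for illustration.

```python
import math
from collections import Counter


def knn_fill_labels(features, labels, k=3):
    """Fill missing labels (None) by majority vote among the k nearest labeled samples.

    Illustrative assumption: Euclidean distance and an unweighted vote.
    The strategy in the paper may differ.
    """
    labeled = [(x, y) for x, y in zip(features, labels) if y is not None]
    if not labeled:
        raise ValueError("at least one labeled sample is required")

    filled = []
    for x, y in zip(features, labels):
        if y is not None:
            # Known labels are kept unchanged.
            filled.append(y)
            continue
        # Rank labeled samples by distance to x and keep the k closest.
        nearest = sorted(labeled, key=lambda pair: math.dist(x, pair[0]))[:k]
        votes = Counter(label for _, label in nearest)
        filled.append(votes.most_common(1)[0][0])
    return filled


# Toy data (hypothetical): two well-separated clusters, two unlabeled samples.
features = [
    (0.0, 0.0), (0.1, 0.2), (0.2, 0.1),
    (5.0, 5.0), (5.1, 4.9), (4.9, 5.2),
    (0.15, 0.15), (5.05, 5.0),
]
labels = [0, 0, 0, 1, 1, 1, None, None]

completed = knn_fill_labels(features, labels, k=3)
print(completed)
```

In this sketch, each unlabeled sample takes the majority label of its nearest labeled neighbors, so the completed label vector can then be passed to any supervised feature selection or classification step.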