Feature selection, as a dimension reduction technique in data mining and pattern recognition, aims to select the most discriminative features and improve learning performance. With an abundance of unlabeled data readily available across various applications, semi-supervised feature selection has emerged as a promising approach. Most semi-supervised feature selection methods rely on simple graphs to preserve the geometrical structure of data, but a simple graph encodes only pairwise relationships and therefore often fails to capture the high-order relationships present in many real-world applications. Hypergraphs, in contrast, can encode more complex structures of data than a simple graph can. In this paper, we propose a feature selection method formulated in the trace ratio form, integrating hypergraph Laplacian-based semi-supervised discriminant analysis (SDA) and the mixed convex and non-convex ℓ2,p-norm (0<p≤1) regularization. The proposed trace ratio-based method, called HSDAFS, leverages the discriminative information from labeled data to maximize class separability while also utilizing the hypergraph Laplacian to capture the geometrical structure and high-order relationships within both labeled and unlabeled data. The ℓ2,p-norm regularization in HSDAFS promotes greater sparsity than the ℓ2,1-norm and ensures that the projection matrix is row-sparse, enabling the effective joint selection of discriminative features across all data. To solve the trace ratio-based HSDAFS problem, we convert it into a trace difference problem and propose an iterative algorithm. Experiments on several datasets demonstrate that HSDAFS selects the most discriminative features more effectively than other methods.
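
The role of the row-sparse projection matrix can be illustrated with a small sketch. It assumes the commonly used definition ‖W‖2,p^p = Σ_i ‖w^i‖_2^p, where w^i is the i-th row of the projection matrix W, and it assumes that features are ranked by the ℓ2-norm of their rows in W, which is typical for sparse-projection feature selection but is not spelled out in this abstract. The function names and the toy matrix are hypothetical and are not taken from HSDAFS.

```python
import math


def row_norm(row):
    """Euclidean (l2) norm of one row of the projection matrix."""
    return math.sqrt(sum(x * x for x in row))


def l2p_norm_p(W, p):
    """Return ||W||_{2,p}^p, i.e. the sum over rows of (row l2-norm) ** p.

    W is a list of rows (one row per feature); p must satisfy 0 < p <= 1.
    """
    if not 0 < p <= 1:
        raise ValueError("p must satisfy 0 < p <= 1")
    return sum(row_norm(row) ** p for row in W)


def rank_features(W, k):
    """Return the indices of the k features whose rows have the largest l2-norm."""
    norms = [row_norm(row) for row in W]
    order = sorted(range(len(W)), key=lambda i: norms[i], reverse=True)
    return order[:k]


# Toy projection matrix: 4 features projected onto 2 dimensions.
# The all-zero row corresponds to a feature that would be discarded.
W = [
    [3.0, 4.0],
    [0.0, 0.0],
    [1.0, 0.0],
    [0.0, 2.0],
]

penalty_p1 = l2p_norm_p(W, 1.0)    # p = 1 recovers the l2,1-norm
penalty_p05 = l2p_norm_p(W, 0.5)   # p < 1 gives the non-convex case
selected = rank_features(W, 2)     # indices of the two strongest rows
```

In this toy case the row norms are 5, 0, 1, and 2, so the ℓ2,1 penalty is 8 and the two selected features are those at indices 0 and 3. A row that is driven entirely to zero contributes nothing to the penalty for any p, which is why such a regularizer discards whole features rather than individual matrix entries.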