Supervised feature selection (FS), as an interpretable dimensionality reduction technique, has received increasing attention, and linear discriminant analysis (LDA)-based methods can select informative features discriminatively and obtain promising performance. When the original data have more features than samples, however, LDA-based methods generally suffer from degradation because the scatter matrix becomes singular. This situation is called the small sample size (SSS) problem. To overcome it and enhance the discriminant power of the selected feature subsets, in this paper we design an LDA-based FS model referred to as Top-k Discriminative FS (TDFS), which is constructed by seamlessly integrating an ℓ2,0-norm equality constraint into the uncorrelated LDA model. More concretely, the ℓ2,0-norm equality constraint explicitly specifies the number of selected features k, ensuring the row sparsity of the projection matrix and selecting the top-k features. The uncorrelated LDA model improves discriminative ability by decorrelating the data in the projected subspace. Since this non-convex model is difficult to solve, we further develop a novel optimization algorithm that effectively addresses the SSS problem during the optimization process. We first decompose the projection matrix into a discrete selection matrix and a corresponding nonzero projection matrix, and then optimize the two matrices jointly with a column-by-column update scheme, during which the invertibility of the scatter matrix in the selected feature subspace is easily guaranteed, thereby resolving the SSS problem. Extensive experiments on four synthetic and eight real-world data sets show that the proposed method outperforms eight competitors, as validated by three classifiers. Moreover, although theoretical analysis proves that our algorithm has quartic time complexity in the number of selected features k, running-time experiments verify that TDFS remains efficient and applicable in scenarios where only a small number of features need to be selected. Overall, our algorithm achieves desirable performance for discriminative FS.
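To make the construction concrete, the following is a minimal sketch of the kind of objective the abstract describes, written under the assumption that the uncorrelated LDA part uses the standard constraint W⊤StW = I; the symbols W, Sw, St, and m are introduced here for illustration and may not match the paper's exact formulation of TDFS.

```latex
% A minimal sketch of an l_{2,0}-constrained uncorrelated-LDA objective
% (an assumed reading of the model, not necessarily the paper's exact form):
\begin{equation*}
  \min_{W \in \mathbb{R}^{d \times m}} \;
  \operatorname{tr}\!\left(W^{\top} S_w W\right)
  \quad \text{s.t.} \quad
  W^{\top} S_t W = I_m, \qquad \lVert W \rVert_{2,0} = k,
\end{equation*}
% where S_w and S_t are the within-class and total scatter matrices,
% \lVert W \rVert_{2,0} counts the nonzero rows of W (so exactly k features
% are retained), and the constraint W^{\top} S_t W = I_m decorrelates the
% data in the projected subspace.
```

Under this reading, the column-by-column update described above would keep the scatter matrix restricted to the k selected features, which is what makes its invertibility easy to maintain even when features outnumber samples.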