Abstract

Supervised feature selection (FS), as an interpretable dimensionality reduction technique, has received increasing attention; linear discriminant analysis (LDA)-based methods can select informative features discriminatively and achieve promising performance. When the original data have more features than samples, however, LDA-based methods generally degrade because the scatter matrix becomes singular (non-invertible), a situation known as the small sample size (SSS) problem. To overcome this problem and enhance the discriminant power of the selected feature subsets, in this paper we design an LDA-based FS model referred to as Top-k Discriminative FS (TDFS), constructed by integrating an ℓ2,0-norm equality constraint into an uncorrelated LDA model. More concretely, the ℓ2,0-norm equality constraint explicitly specifies the number of selected features k, ensuring the row sparsity of the projection matrix and selecting the top-k features. The uncorrelated LDA model improves discriminative ability by requiring the data to be uncorrelated in the projected subspace. Since this non-convex model is difficult to solve, a novel optimization algorithm is further developed, and the SSS problem is effectively addressed during the optimization process. We first decompose the projection matrix into a discrete selection matrix and a corresponding nonzero projection matrix, and then optimize the two matrices jointly with a column-by-column update scheme, during which the invertibility of the scatter matrix in the selected feature subspace is easily guaranteed, thereby resolving the SSS problem. Extensive experiments on four synthetic data sets and eight real-world data sets show that the proposed method outperforms eight competitors, as validated with three classifiers. Moreover, although theoretical analysis shows that our algorithm has quartic time complexity in the number of selected features k, running-time experiments verify that TDFS remains efficient and applicable in scenarios where only a small number of features need to be selected. Overall, the proposed algorithm achieves discriminative FS with desirable performance.
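To make the core idea concrete, the following is a minimal sketch, not the authors' TDFS optimization: it illustrates, under assumed simplifications, why restricting the scatter matrices to the selected feature subspace keeps them small (k × k) and invertible even when the number of features far exceeds the number of samples. The greedy column-by-column selection, the Fisher-style trace criterion, and the small ridge term are illustrative assumptions, not the published algorithm.

```python
# Illustrative sketch only (not the authors' TDFS algorithm): greedy,
# column-by-column selection of k features that maximizes a Fisher-style
# trace criterion computed on the selected feature subspace, so the
# scatter matrix stays k x k and invertible in the d >> n (SSS) regime.
import numpy as np

def scatter_matrices(X, y):
    """Between-class (Sb) and total (St) scatter of X (n_samples x d)."""
    mean = X.mean(axis=0)
    Sb = np.zeros((X.shape[1], X.shape[1]))
    for c in np.unique(y):
        Xc = X[y == c]
        diff = (Xc.mean(axis=0) - mean)[:, None]
        Sb += Xc.shape[0] * diff @ diff.T
    Xc = X - mean
    St = Xc.T @ Xc
    return Sb, St

def greedy_top_k_discriminative(X, y, k, ridge=1e-8):
    """Pick k feature indices by greedily maximizing trace(St_F^{-1} Sb_F)."""
    Sb, St = scatter_matrices(X, y)
    selected, remaining = [], list(range(X.shape[1]))
    for _ in range(k):
        best_j, best_score = None, -np.inf
        for j in remaining:
            F = selected + [j]
            Sb_F = Sb[np.ix_(F, F)]
            # Small ridge keeps St_F invertible (an assumption for robustness).
            St_F = St[np.ix_(F, F)] + ridge * np.eye(len(F))
            score = np.trace(np.linalg.solve(St_F, Sb_F))
            if score > best_score:
                best_j, best_score = j, score
        selected.append(best_j)
        remaining.remove(best_j)
    return selected

# Usage example on random data with many more features than samples.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 500))
y = rng.integers(0, 3, size=40)
print(greedy_top_k_discriminative(X, y, k=5))
```

In this sketch the per-step cost grows with the size of the current subset, echoing the abstract's point that the approach is most practical when only a small number of features need to be selected.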
