Abstract

Feature selection can efficiently alleviate the curse of dimensionality, especially for multi-label data, where multiple features embody diverse semantics. Although many supervised feature selection methods have been proposed, they generally assume that the labels of the training data are complete, whereas in many real applications we only have data with incomplete labels. Some methods can select features from training data with missing labels, but they still cannot handle feature selection with a large and sparse label space. In addition, these approaches focus on global feature correlations, although some feature correlations are local and shared only by a subset of the data. In this paper, we introduce an approach called Feature Selection with missing labels based on Label Compression and Local feature Correlation (FSLCLC for short). FSLCLC applies low-rank matrix factorization to the sparse sample-label association matrix to compress the labels and recover missing labels in the compressed label space. In addition, it uses a sparsity regularizer and a manifold regularizer induced by local feature correlations to select discriminative features. To solve the joint optimization objective for label compression, missing-label recovery, and feature selection, we develop an iterative algorithm with guaranteed convergence. Experimental results on benchmark datasets show that the proposed FSLCLC outperforms state-of-the-art multi-label feature selection algorithms.
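The label-compression step described above can be illustrated with a minimal low-rank factorization sketch. This is not the authors' FSLCLC update rule (the abstract does not give it); it is a generic alternating ridge-regression factorization of a sample-label matrix Y into a compressed representation V and a label basis B, with all names (`compress_labels`, `V`, `B`, `lam`) hypothetical:

```python
import numpy as np

def compress_labels(Y, k, n_iter=200, lam=0.1, seed=0):
    """Illustrative low-rank factorization Y ~ V @ B of a sample-label
    matrix Y (n_samples x n_labels). NOT the FSLCLC objective; a generic
    sketch minimizing ||Y - V B||_F^2 + lam (||V||_F^2 + ||B||_F^2)
    by alternating least squares.

    V : (n_samples, k) compressed label representation
    B : (k, n_labels)  label basis
    """
    rng = np.random.default_rng(seed)
    n, m = Y.shape
    V = 0.1 * rng.standard_normal((n, k))
    B = 0.1 * rng.standard_normal((k, m))
    I = lam * np.eye(k)
    for _ in range(n_iter):
        # Fix B, ridge-solve for V: (B B^T + lam I) V^T = B Y^T
        V = np.linalg.solve(B @ B.T + I, B @ Y.T).T
        # Fix V, ridge-solve for B: (V^T V + lam I) B = V^T Y
        B = np.linalg.solve(V.T @ V + I, V.T @ Y)
    return V, B
```

In a setting like the paper's, missing entries of Y would be recovered from the low-rank product V @ B, since the compressed space k is much smaller than a large, sparse label space.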

