Multi-label feature selection reduces the dimensionality of data by selecting representative features. Existing multi-label feature selection methods usually assume that all labels are equally significant, so they derive the relationships between samples directly from the entire label space and ignore both the shape of the label distribution and class imbalance. To address these issues, we propose a novel multi-label feature selection approach. First, based on non-negative matrix factorization (NMF), we constrain the similarity between the logical labels and the label distribution, so that the shape of the label distribution does not deviate too far from the underlying actual shape. Further, graph embedding is used to constrain the relationships between samples in the label space and the feature space. Finally, we leverage the properties of label distribution and class imbalance to generate the relationships between samples in the label space, and propose a multi-label feature selection approach based on fuzzy information entropy. The proposed method is compared with eight state-of-the-art methods to validate its effectiveness.
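To make the NMF-based step more concrete, the following is a minimal sketch of how a label distribution might be recovered from a logical (0/1) label matrix. It is not the formulation proposed in this work: the function name `nmf_label_distribution`, the rank, the iteration count, and the use of plain Lee-Seung multiplicative updates without any similarity or graph-embedding constraints are all illustrative assumptions.

```python
import numpy as np

def nmf_label_distribution(Y, rank=2, iters=200, eps=1e-9, seed=0):
    """Hypothetical helper: approximate a logical label matrix Y (n x q)
    by non-negative factors W (n x rank) and H (rank x q), then normalise
    each row of the reconstruction so it can be read as a label distribution."""
    rng = np.random.default_rng(seed)
    n, q = Y.shape
    W = rng.random((n, rank)) + eps
    H = rng.random((rank, q)) + eps
    for _ in range(iters):
        # Standard multiplicative updates for the Frobenius-norm NMF objective;
        # they keep W and H non-negative (assumption: no extra regularisers).
        H *= (W.T @ Y) / (W.T @ W @ H + eps)
        W *= (Y @ H.T) / (W @ H @ H.T + eps)
    D = W @ H + eps                            # smoothed, non-negative reconstruction
    return D / D.sum(axis=1, keepdims=True)    # each row sums to 1

# Toy logical label matrix: 4 samples, 3 labels (made-up data).
Y = np.array([[1, 0, 1],
              [1, 0, 0],
              [0, 1, 1],
              [0, 1, 0]], dtype=float)
D = nmf_label_distribution(Y)
```

Each row of `D` assigns a non-negative degree to every label, so labels that are equally "present" in the logical matrix can receive different weights. A full method along the lines described above would additionally tie `D` to the logical labels and to the sample relationships in feature space.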