Abstract

Feature selection, as an important pre-processing technique, can efficiently mitigate "the curse of dimensionality" by selecting discriminative features. In multi-label learning in particular, a discriminative feature subset can improve classification accuracy. Existing feature selection methods for multi-label classification address the problem of label ambiguity with logical labels. However, in many practical applications the significance of each label differs. Training a model on logical labels may therefore yield unsatisfactory performance, because logical labels ignore how important each related label is to each sample. To address this issue, a novel two-step multi-label feature selection algorithm is proposed: label enhancement, followed by label correlation-based feature selection on the enhanced labels. In the label enhancement step, a label enhancement framework based on deep forest transforms the logical labels into label distributions, which contain rich semantic information and guide a more accurate exploration of semantic correlations. In the feature selection step, a novel multi-label feature selection algorithm is proposed for label distribution data. First, the samples are divided into multiple clusters by spectral clustering in the label space. Then, the label correlations are captured through these clusters. Finally, the l2,1-norm is used to construct an objective function that performs multi-label feature selection. Experimental results demonstrate the competitiveness of the proposed algorithm against six state-of-the-art multi-label feature selection algorithms on eighteen benchmark datasets in terms of six widely accepted evaluation metrics.
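
As an illustration of the role the l2,1-norm usually plays in feature selection, the following minimal Python sketch computes the norm of a weight matrix and ranks features by their row norms. It is not the objective function of the proposed algorithm, which the abstract does not state. The matrix W (one row per feature, one column per label) and the helper names row_norms, l21_norm and rank_features are assumptions made for this example only.

```python
import math

def row_norms(W):
    """Euclidean (l2) norm of each row of W; each row is assumed to hold one feature's weights."""
    return [math.sqrt(sum(w * w for w in row)) for row in W]

def l21_norm(W):
    """l2,1-norm: the sum of the l2 norms of the rows of W."""
    return sum(row_norms(W))

def rank_features(W):
    """Feature indices sorted from largest to smallest row norm."""
    norms = row_norms(W)
    return sorted(range(len(W)), key=lambda i: norms[i], reverse=True)

# Hypothetical weight matrix: 3 features (rows) x 3 labels (columns).
W = [
    [0.0, 0.0, 0.0],  # feature 0: zero weight for every label
    [3.0, 4.0, 0.0],  # feature 1: row norm 5
    [1.0, 0.0, 0.0],  # feature 2: row norm 1
]

print(l21_norm(W))       # 0 + 5 + 1
print(rank_features(W))  # features ordered by row norm
```

Penalising this norm pushes entire rows of W towards zero, so a feature is kept or discarded for all labels jointly. That is why l2,1-regularised objectives are a common choice for selecting a shared feature subset.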
