Label correlations-based multi-label feature selection with label enhancement

Wenbin Qian,Yinsong Xiong,Weiping Ding,Jintao Huang,Chi-Man Vong

doi:10.1016/j.engappai.2023.107310

https://doi.org/10.1016/j.engappai.2023.107310

Abstract

Feature selection, as an important pre-processing technique, can efficiently mitigate "the curse of dimensionality" by selecting discriminative features. In multi-label learning in particular, a discriminative feature subset can improve classification accuracy. Existing feature selection methods for multi-label classification address the problem of label ambiguity by using logical labels. However, in many practical applications the significance of each label differs from sample to sample. Training the model on logical labels may therefore give unsatisfactory performance, because it ignores the relative importance of the labels associated with each sample. To address this issue, a novel two-step multi-label feature selection algorithm is proposed: label enhancement, followed by label correlations-based feature selection on the enhanced labels. In the label enhancement step, a label enhancement framework based on deep forest transforms the logical labels into label distributions, which contain rich semantic information and guide a more accurate exploration of semantic correlations. In the feature selection step, a novel multi-label feature selection algorithm is proposed for label distribution data. First, the samples are divided into multiple clusters by applying spectral clustering in the label space. Then, the label correlations are captured through these clusters. Finally, the ℓ2,1-norm is used to construct an objective function that performs multi-label feature selection. Experimental results demonstrate the competitiveness of the proposed algorithm against six state-of-the-art multi-label feature selection algorithms on eighteen benchmark datasets in terms of six widely accepted evaluation metrics.
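
To make the role of the ℓ2,1-norm in the final step more concrete, the following is a minimal sketch of generic ℓ2,1-regularized regression for feature ranking. It is not the objective function proposed in the paper, which the abstract does not spell out: the least-squares loss, the iterative reweighting solver, the function names `l21_norm` and `l21_feature_ranking`, and the parameters `lam`, `n_iter`, and `eps` are all assumptions made for illustration, and the cluster-based label correlation term is omitted.

```python
import numpy as np


def l21_norm(W):
    """l2,1-norm: the sum of the Euclidean norms of the rows of W."""
    return float(np.sum(np.linalg.norm(W, axis=1)))


def l21_feature_ranking(X, D, lam=1.0, n_iter=50, eps=1e-8):
    """Rank features by approximately minimizing
        ||X W - D||_F^2 + lam * ||W||_{2,1}
    with iteratively reweighted least squares (an assumed, generic solver).

    X : (n_samples, n_features) feature matrix
    D : (n_samples, n_labels) target matrix, e.g. label distributions
    Returns (feature indices sorted by decreasing importance, W).
    """
    n_features = X.shape[1]
    G = np.eye(n_features)  # diagonal reweighting matrix, starts as identity
    XtX = X.T @ X
    XtD = X.T @ D
    for _ in range(n_iter):
        # Closed-form update for W with G held fixed.
        W = np.linalg.solve(XtX + lam * G, XtD)
        # Update G from the current row norms; eps avoids division by zero.
        row_norms = np.linalg.norm(W, axis=1)
        G = np.diag(1.0 / (2.0 * np.maximum(row_norms, eps)))
    # A feature whose row of W has a large norm matters for many labels.
    scores = np.linalg.norm(W, axis=1)
    return np.argsort(-scores), W
```

Because the penalty sums row norms, it pushes entire rows of `W` toward zero, so a feature is kept or discarded jointly for all labels. Calling `l21_feature_ranking(X, D)` with a feature matrix `X` and a label distribution matrix `D` returns the features ordered by the norm of their rows in `W`, and the leading indices form the selected subset.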
