Multi-label feature selection based on the division of label topics

Ping Zhang, Wanfu Gao, Juncheng Hu, Yonghao Li

doi:10.1016/j.ins.2020.12.036

Abstract

Multi-label feature selection has attracted much attention from researchers and can reduce the high dimensionality of multi-label data. Previous multi-label methods treat all labels as equally important and therefore choose discriminative features based on the entire label set. In fact, the label set has a latent semantic structure: labels can be grouped into central topics and subordinate topics. Features related to central topics should be selected first, and more of them should be selected. To this end, we first explore the latent semantic structure using spectral clustering, which groups the labels into several clusters, termed central clusters and subordinate clusters. Second, we score the importance of each feature with respect to the labels in each cluster. Finally, we obtain the feature subset based on both the feature scores and the cluster types. Comprehensive experiments on fourteen benchmark multi-label data sets demonstrate the superiority of the proposed method over seven state-of-the-art multi-label feature selection methods.
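
The three steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and every concrete choice in it is an assumption made for illustration: the label affinity is the absolute correlation between label columns, the spectral step is a simple two-way split on the Fiedler vector, the "central" cluster is taken to be the one containing more labels, features are scored by mean absolute Pearson correlation, and the names `split_labels_spectral`, `score_features`, `select_features`, and `central_share` are hypothetical.

```python
import numpy as np


def split_labels_spectral(Y):
    """Two-way spectral split of the label columns of Y (n_samples x n_labels)."""
    # Assumed affinity: absolute correlation between label columns.
    A = np.abs(np.nan_to_num(np.corrcoef(Y, rowvar=False)))
    np.fill_diagonal(A, 0.0)
    L = np.diag(A.sum(axis=1)) - A       # unnormalized graph Laplacian
    _, vecs = np.linalg.eigh(L)          # eigenvalues come back in ascending order
    fiedler = vecs[:, 1]                 # eigenvector of the second-smallest eigenvalue
    return (fiedler > 0).astype(int)     # cluster id (0 or 1) for each label


def score_features(X, Y_cluster):
    """Assumed score: mean absolute Pearson correlation with the cluster's labels."""
    Xs = (X - X.mean(axis=0)) / (X.std(axis=0) + 1e-12)
    Ys = (Y_cluster - Y_cluster.mean(axis=0)) / (Y_cluster.std(axis=0) + 1e-12)
    corr = Xs.T @ Ys / X.shape[0]        # shape: n_features x n_labels_in_cluster
    return np.abs(corr).mean(axis=1)


def select_features(X, Y, k, central_share=0.7):
    """Pick up to k features, serving the central cluster first and with a larger quota."""
    clusters = split_labels_spectral(Y)
    sizes = np.bincount(clusters, minlength=2)
    central = int(np.argmax(sizes))      # assumption: the larger cluster is "central"
    q_central = int(round(k * central_share))
    plan = [(central, q_central), (1 - central, k - q_central)]

    selected = []
    for cluster_id, quota in plan:
        mask = clusters == cluster_id
        if not mask.any():
            continue                     # an empty cluster contributes no features
        scores = score_features(X, Y[:, mask])
        ranked = [int(j) for j in np.argsort(-scores) if int(j) not in selected]
        selected.extend(ranked[:quota])
    return selected
```

Calling `select_features(X, Y, k=10)` on a feature matrix `X` and a binary label matrix `Y` returns a list of column indices, with the features chosen for the central cluster listed first. The sketch handles only two clusters; a multi-cluster version would replace the Fiedler-vector split with a full spectral clustering routine and would need a criterion, which the abstract does not specify, for labeling each cluster as central or subordinate.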

Full Text