Abstract

In multi-label learning, high dimensionality is the most prominent characteristic of the data. An efficient pre-processing step, feature selection, is required to mitigate the "curse of dimensionality" caused by irrelevant and redundant features in the high-dimensional feature space. However, in most practical applications the labels related to an instance differ in significance. Motivated by this, the paper integrates label distribution learning into multi-label feature selection in order to mine the additional supervised information that equivalence relations ignore in the label space of multi-label data. From the perspective of granular computing, a novel label enhancement algorithm based on the fuzzy similarity relation is presented; it uses the similarity between instances to explore hidden label relevance and transforms the logical labels of multi-label data into label distributions. A label distribution feature selection algorithm is then presented, which measures the significance of features within the fuzzy mutual information framework. The presented algorithm is compared with six state-of-the-art multi-label feature selection algorithms on twelve publicly available multi-label datasets. The experimental results indicate that the presented algorithm achieves significant improvement over the existing algorithms.
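
To make the label enhancement idea concrete, the following is a minimal illustrative sketch, not the algorithm proposed in the paper. It assumes one simple fuzzy similarity relation (one minus the mean absolute difference of min-max scaled features) and assumes that each instance's label distribution is obtained by similarity-weighted voting over the logical labels of all instances, followed by normalization. The function names `fuzzy_similarity` and `enhance_labels` and the toy data are hypothetical.

```python
import numpy as np


def fuzzy_similarity(X):
    """Pairwise fuzzy similarity matrix with entries in [0, 1].

    Assumption: similarity is 1 minus the mean absolute difference of
    min-max scaled features. The paper may define its relation differently.
    """
    X = np.asarray(X, dtype=float)
    lo = X.min(axis=0)
    span = X.max(axis=0) - lo
    span[span == 0] = 1.0  # constant features contribute zero difference
    Z = (X - lo) / span
    diff = np.abs(Z[:, None, :] - Z[None, :, :])  # shape (n, n, d)
    return 1.0 - diff.mean(axis=2)


def enhance_labels(X, Y):
    """Turn logical labels Y (n x q, entries 0/1) into label distributions.

    Assumption: the description degree of a label for an instance is the
    similarity-weighted share of instances carrying that label, rescaled
    so that each row sums to 1.
    """
    Y = np.asarray(Y, dtype=float)
    S = fuzzy_similarity(X)
    scores = (S @ Y) / S.sum(axis=1, keepdims=True)
    totals = scores.sum(axis=1, keepdims=True)
    totals[totals == 0] = 1.0  # leave all-zero rows untouched
    return scores / totals


# Toy data (hypothetical): three instances, two features, two labels.
X = np.array([[0.0, 0.0], [0.1, 0.0], [1.0, 1.0]])
Y = np.array([[1, 0], [1, 0], [0, 1]])
D = enhance_labels(X, Y)
print(D)
```

In this sketch, an instance that is close to neighbors carrying a label it does not itself carry still receives a small description degree for that label, which is the kind of hidden label relevance the abstract refers to. The feature-scoring step with fuzzy mutual information is not shown here.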
