Abstract

In multilabel learning, the curse of dimensionality is one of the major challenges. Existing single-label feature selection methods cannot be directly applied to multilabel data, and multilabel feature selection has thus been widely studied. As an effective granular computing tool, rough set theory has been applied to multilabel feature selection in various real-world applications. However, existing rough set-based methods cannot effectively characterize the ability of features to distinguish multilabel sample pairs, and they usually have high time complexity. In this article, we propose two novel multilabel feature selection methods from the perspective of discerning sample pairs. First, relative discernibility pair matrices of features are defined in the framework of fuzzy rough sets, where each element represents the degree to which the features distinguish the corresponding sample pair. On this basis, a novel evaluation measure for feature subsets is defined. A heuristic multilabel feature selection approach based on the proposed measure, named RDPM, is then put forward. Inspired by sampling and ensemble strategies, a second efficient and robust multilabel feature selection approach, named RDPM_SE, is also proposed. Finally, extensive experiments are conducted on 13 real-world multilabel datasets, and the results show that the proposed algorithms outperform seven state-of-the-art methods in terms of both performance and running time.
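
The idea of scoring features by how well they discern sample pairs can be illustrated with a minimal Python sketch. It is not the RDPM definition from the article: the per-pair degree (feature difference capped by the normalised Hamming distance between label vectors), the subset score (sum of element-wise maxima), and the function names `pair_matrices`, `subset_score`, and `greedy_select` are all assumptions chosen for illustration only.

```python
import numpy as np


def pair_matrices(X, Y):
    """Return an (m, n, n) array; entry [k, i, j] is an assumed degree to which
    feature k separates samples i and j, capped by how much their labels differ."""
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    # Min-max scale each feature to [0, 1]; constant features map to 0.
    span = X.max(axis=0) - X.min(axis=0)
    span[span == 0] = 1.0
    Xs = (X - X.min(axis=0)) / span
    # Per-feature dissimilarity |x_ik - x_jk|, shape (m, n, n).
    feat_diff = np.abs(Xs.T[:, :, None] - Xs.T[:, None, :])
    # Normalised Hamming distance between label vectors, shape (n, n).
    label_diff = np.abs(Y[:, None, :] - Y[None, :, :]).mean(axis=2)
    # Assumption: a pair only needs discerning as far as its labels differ.
    return np.minimum(feat_diff, label_diff[None, :, :])


def subset_score(R, subset):
    """Assumed evaluation measure: total discernment achieved by a feature
    subset, taking the best feature for every sample pair."""
    if not subset:
        return 0.0
    return float(R[list(subset)].max(axis=0).sum())


def greedy_select(X, Y, k):
    """Heuristic forward search: repeatedly add the feature that raises
    the subset score the most, until k features are chosen."""
    R = pair_matrices(X, Y)
    selected = []
    remaining = list(range(R.shape[0]))
    while remaining and len(selected) < k:
        best = max(remaining, key=lambda f: subset_score(R, selected + [f]))
        selected.append(best)
        remaining.remove(best)
    return selected
```

For example, `greedy_select(X, Y, 5)` returns the indices of five features in the order they were picked. Materialising all pair matrices costs O(mn²) memory, which is one reason a sampling-based variant is attractive on larger datasets.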

