Abstract

Owing to the high dimensionality of multilabel data, feature selection is necessary in multilabel learning to remove redundant features and improve the performance of multilabel classification. Rough set theory, a well-established mathematical tool for data analysis, has been widely applied to feature selection (also called attribute reduction). In this study, we propose a variable precision attribute reduct for multilabel data based on rough set theory, called the δ-confidence reduct, which can correctly capture the uncertainty implied among labels. Furthermore, the judgement theory and the discernibility matrix associated with the δ-confidence reduct are introduced, from which an approach to knowledge reduction in multilabel decision tables is obtained.

Highlights

  • Conventional supervised learning deals with single-label data, where each instance is associated with a single class label

  • In many real-world tasks, one instance may simultaneously belong to multiple class labels; for example, in text categorization problems, every document may be labeled with several predefined topics, such as religion and political topics [1]; in image annotation problems, a photograph may be associated with more than one tag, such as elephant, jungle, and Africa [2]; in functional genomics, each gene may be related to a set of functional classes, such as metabolism, transcription, and protein synthesis [3]

  • Owing to the high dimensionality of multilabel data, feature selection is necessary in multilabel learning to remove redundant features and improve the performance of multilabel classification

Summary

Introduction

Conventional supervised learning deals with single-label data, where each instance is associated with a single class label. In many real-world tasks, one instance may simultaneously belong to multiple class labels; for example, in text categorization problems, every document may be labeled with several predefined topics, such as religion and political topics [1]; in image annotation problems, a photograph may be associated with more than one tag, such as elephant, jungle, and Africa [2]; in functional genomics, each gene may be related to a set of functional classes, such as metabolism, transcription, and protein synthesis [3]. The judgement theory and the discernibility matrix associated with the δ-confidence reduct are established. These results provide approaches to knowledge reduction for multilabel data, which are significant from both theoretical and applied perspectives.
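
As a rough illustration of how a discernibility matrix supports attribute reduction, the sketch below builds the classical rough-set discernibility matrix for a toy multilabel decision table and enumerates its minimal reducts by brute force. It is not the δ-confidence reduct itself, whose definition is not given in this summary. The table, the attribute names a, b, c, the labels l1, l2, and the function names are all hypothetical, and treating each instance's whole label set as its decision value is a simplifying assumption made only for this example.

```python
from itertools import combinations

# Hypothetical toy multilabel decision table:
# each row is (condition attribute values, set of labels).
TABLE = [
    ({"a": 0, "b": 1, "c": 0}, frozenset({"l1"})),
    ({"a": 0, "b": 1, "c": 1}, frozenset({"l1", "l2"})),
    ({"a": 1, "b": 0, "c": 1}, frozenset({"l2"})),
    ({"a": 1, "b": 0, "c": 1}, frozenset({"l2"})),
]


def discernibility_matrix(table):
    """Map each pair of rows with different label sets to the attributes that tell them apart."""
    matrix = {}
    for (i, (xi, di)), (j, (xj, dj)) in combinations(enumerate(table), 2):
        if di != dj:  # assumption: the whole label set acts as the decision value
            diff = frozenset(attr for attr in xi if xi[attr] != xj[attr])
            if diff:  # skip inconsistent pairs that no attribute can separate
                matrix[(i, j)] = diff
    return matrix


def minimal_reducts(table):
    """Brute-force search for minimal attribute subsets that intersect every matrix entry."""
    attrs = sorted(table[0][0])
    matrix = discernibility_matrix(table)
    found = []
    # Subsets are visited by increasing size, so any superset of an
    # already accepted reduct can be rejected as non-minimal.
    for size in range(len(attrs) + 1):
        for subset in combinations(attrs, size):
            candidate = frozenset(subset)
            hits_all = all(candidate & entry for entry in matrix.values())
            if hits_all and not any(r <= candidate for r in found):
                found.append(candidate)
    return found


if __name__ == "__main__":
    for pair, entry in sorted(discernibility_matrix(TABLE).items()):
        print(pair, sorted(entry))
    print([sorted(r) for r in minimal_reducts(TABLE)])
```

In this toy table, rows 0 and 1 differ only in attribute c, so every reduct must keep c, while either a or b is enough to separate the remaining pairs. The exhaustive search is exponential in the number of attributes and is meant only to make the idea concrete on small examples.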

Preliminaries
The Multilabel Data
The New Attribute Reduction Approach in Multilabel Data Decision Tables
Conclusion
