Abstract

Multi-label data classification has become an important and active research topic, where the classification algorithm is required to deal with prediction of sets of label indicators for instances simultaneously. Label powerset (LP) method reduces the multi-label classification problem to a single-label multi-class classification problem by treating each distinct combination of labels. However, the predictive performance of LP is challenged with imbalanced distribution among the labelsets, deteriorating the performance of traditional classifiers. In this paper, we study the problem of multi-label imbalanced data classification and propose a novel solution, called CSRankSVM (Cost sensitive Ranking Support Vector Machine), which assigns a different misclassification cost for each labelset to effectively tackle the problem of imbalance for Multi-label data. Empirical studies on popular benchmark datasets with various imbalance ratios of labelsets demonstrate that the proposed CSRankSVM approach can effectively boost classification performances in multi-label datasets.

Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.