Abstract

 
 
 Traditional classification algorithms consider learning problems that contain only one label, i.e., each example is associated with a single nominal target variable characterizing its property. However, the number of practical applications involving data with multiple target variables has increased. To learn from this sort of data, multi-label classification algorithms should be used. The task of learning from multi-label data can be addressed by methods that transform the multi-label classification problem into several single-label classification problems. In this work, we use two well-known methods based on this approach, together with a third method that we propose to overcome some deficiencies of one of them, in a case study on textual data related to medical findings, structured using the bag-of-words approach. The experimental study using these three methods shows improved results for our proposed multi-label classification method.
 
 
Highlights
Traditional single-label classification methods are concerned with learning from a set of examples that are associated with a single label y from a set of disjoint labels L, |L| > 1 [9, 1]
Although Label Power Set (LP) takes label dependency into account, when a large or even moderate number of labels is considered, multi-class learning over the label power sets becomes rather challenging due to the tremendous number of possible label sets
Binary Relevance (BR)+ was implemented using Mulan, a package of Java classes for multi-label classification based on Weka, a collection of machine learning algorithms for data mining tasks implemented in Java
Summary
Traditional single-label classification methods are concerned with learning from a set of examples that are associated with a single label y from a set of disjoint labels L, |L| > 1 [9, 1]. The multi-label problem can be transformed into one multi-class single-label learning problem, using as target values for the class attribute all the unique subsets of labels present in the training instances (the distinct label subsets). This method is called Label Power Set (LP). Dependencies among labels are mapped directly from the data, since every combination of single labels present in the training instances is used as a possible label in the corresponding multi-class single-label classification problem. In this context, the Binary Relevance method has been strongly criticized for its inability to handle label dependency information [10].
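The two problem transformations mentioned above can be illustrated with a minimal sketch. It is not the implementation used in the study (which relied on Mulan and Weka); the label names, the toy label sets, and the helper functions `binary_relevance_targets` and `label_powerset_targets` are hypothetical and chosen only for illustration. The sketch builds the target columns that each transformation would hand to a single-label learner, and compares the number of distinct label sets observed with the 2^|L| sets that are possible in principle.

```python
# Toy multi-label data (hypothetical): the label set of each training example.
LABELS = ["fever", "cough", "rash"]
Y = [
    {"fever", "cough"},
    {"cough"},
    {"fever", "cough"},
    {"rash"},
]


def binary_relevance_targets(label_sets, labels):
    """Binary Relevance view: one independent 0/1 target column per label."""
    return {label: [int(label in s) for s in label_sets] for label in labels}


def label_powerset_targets(label_sets):
    """Label Power Set view: each distinct label set becomes one class id."""
    class_of = {}   # frozenset of labels -> class id, in order of first appearance
    targets = []
    for s in label_sets:
        key = frozenset(s)
        if key not in class_of:
            class_of[key] = len(class_of)
        targets.append(class_of[key])
    return targets, class_of


br = binary_relevance_targets(Y, LABELS)
lp, classes = label_powerset_targets(Y)

print("BR targets:", br)
print("LP targets:", lp)
print(len(classes), "distinct label sets observed;",
      2 ** len(LABELS), "label sets possible for", len(LABELS), "labels")
```

In the Binary Relevance view each column is learned separately, so the co-occurrence of "fever" and "cough" is invisible to the individual classifiers. In the Label Power Set view that co-occurrence is itself a class, but the number of candidate classes can grow up to 2^|L|, which is the difficulty noted in the highlights.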