Multi-Label Classification (MLC) assumes that each instance may belong to a set of labels, unlike traditional classification, where each instance corresponds to a single value of a class variable. Calibrated Label Ranking (CLR) is an MLC algorithm that determines a ranking of labels for a given instance by training a binary classifier for each pair of labels. In this way, it exploits pairwise label correlations. Furthermore, CLR alleviates the class-imbalance problem that usually arises in MLC, where a label often has very few positive instances. Building the binary classifiers in CLR requires a standard classification algorithm, and the Decision Tree method C4.5 has been widely used for this purpose. In this research, we show that Credal C4.5, a recently proposed version of C4.5 based on imprecise probabilities, is more appropriate than C4.5 for the binary classification tasks in CLR. Experimental results reveal that Credal C4.5 outperforms C4.5 when both methods are used in CLR, and that the difference becomes more statistically significant as the level of label noise increases.
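To make the pairwise decomposition concrete, the following is a minimal sketch of how CLR might be organized, under several assumptions that are not taken from this work: the names `VIRTUAL`, `pairwise_datasets`, `train_clr`, and `predict_clr` are hypothetical, the virtual calibration label follows the usual description of CLR in the literature, ties with the virtual label are treated as "not relevant", and a trivial 1-nearest-neighbour learner stands in for C4.5 or Credal C4.5 purely to keep the example self-contained.

```python
from itertools import combinations

VIRTUAL = "__calibration__"  # hypothetical name for CLR's virtual label


def pairwise_datasets(X, Y, labels):
    """Build one binary training set per label pair, plus calibration pairs.

    X is a list of feature vectors and Y a list of sets of relevant labels.
    A target of 1 means the first label of the pair is preferred.
    """
    data = {}
    # Label vs. label: keep only instances where exactly one of the two is relevant.
    for a, b in combinations(labels, 2):
        data[(a, b)] = [
            (x, 1 if a in y else 0) for x, y in zip(X, Y) if (a in y) != (b in y)
        ]
    # Label vs. virtual label: every instance is used; relevant labels beat it.
    for a in labels:
        data[(a, VIRTUAL)] = [(x, 1 if a in y else 0) for x, y in zip(X, Y)]
    return data


class NearestNeighbour:
    """Stand-in base learner (1-NN); not the tree learners discussed above."""

    def fit(self, rows):
        self.rows = rows
        return self

    def predict(self, x):
        if not self.rows:
            return 0  # arbitrary default when a pair has no training data
        nearest = min(
            self.rows, key=lambda r: sum((p - q) ** 2 for p, q in zip(r[0], x))
        )
        return nearest[1]


def train_clr(X, Y, labels, make_learner=NearestNeighbour):
    """Fit one binary model per pair produced by pairwise_datasets."""
    sets = pairwise_datasets(X, Y, labels)
    return {pair: make_learner().fit(rows) for pair, rows in sets.items()}


def predict_clr(models, x, labels):
    """Rank labels by pairwise votes; labels above the virtual one are relevant."""
    votes = {label: 0 for label in labels}
    votes[VIRTUAL] = 0
    for (a, b), model in models.items():
        votes[a if model.predict(x) == 1 else b] += 1
    ranking = sorted(votes, key=lambda label: -votes[label])
    relevant = {label for label in labels if votes[label] > votes[VIRTUAL]}
    return ranking, relevant


# Toy data (made up for illustration only).
X = [(0, 0), (0, 1), (1, 0), (1, 1)]
Y = [{"a"}, {"a", "b"}, {"c"}, {"b", "c"}]
labels = ["a", "b", "c"]

models = train_clr(X, Y, labels)
ranking, relevant = predict_clr(models, (0, 0), labels)
print(ranking, relevant)
```

Any binary classifier exposing `fit` and `predict` in this shape could be passed as `make_learner`, which is the slot where a decision-tree learner would go. Note that each label-versus-label training set discards instances in which both or neither label is relevant, which is one reason the pairwise problems tend to be better balanced than one-label-at-a-time problems.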