Abstract

Multi-label classification is a machine learning task, often over a large number of labels, in which an instance may be associated with multiple class labels simultaneously. Although significant progress has been achieved, multi-label classification remains challenging because of the high-dimensional label spaces arising from the emergence of many applications. To address this, dimensionality reduction, originally developed for the feature space, has also been applied to the label space by exploiting label correlation information, giving rise to two kinds of techniques: label embedding and label selection. Label embedding has been studied extensively, but label selection has received less attention. The infinite feature selection algorithm (Inf-FS) finds the most discriminative feature subset. It treats feature subsets as paths in a graph, and algebraic theory allows the values of paths of arbitrary length to be evaluated. Letting the path length go to infinity makes it possible to simplify the computational complexity of the selection process and takes into account the value of every path (subset) that contains a specific feature. Ranking the features by these values then determines the feature subset to keep. We apply this algorithm to label selection by treating label subsets as paths. After label selection, an operator is needed to recover the original label space from the selected one. An effective classifier is trained on the selected label subset, and its predicted values for the selected labels are then propagated to the full label set to recover the original label space. We apply our model to five benchmark data sets with more than 100 labels. Experimental results show that our method achieves better classification performance than other state-of-the-art methods in terms of two evaluation metrics (precision and discounted gain@n) for high-dimensional label spaces.
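
The path-based ranking described above can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions: the label affinity is a cosine co-occurrence measure chosen only for illustration, the function names (label_affinity, infinite_path_scores, select_labels) and the damping parameter are hypothetical, and the infinite sum over path lengths is evaluated by a simple fixed-point iteration rather than a matrix inverse. The score of a label is the row sum of sum_{l>=1} (r A)^l, which converges when r times the spectral radius of A is below 1.

```python
def label_affinity(Y):
    """Cosine co-occurrence between label columns of a 0/1 matrix Y.

    Y is a list of rows (instances); each row holds one 0/1 entry per label.
    This affinity is an illustrative assumption, not the paper's definition.
    """
    n_labels = len(Y[0])
    counts = [sum(row[j] for row in Y) for j in range(n_labels)]
    A = [[0.0] * n_labels for _ in range(n_labels)]
    for i in range(n_labels):
        for j in range(n_labels):
            if i != j and counts[i] and counts[j]:
                co = sum(row[i] * row[j] for row in Y)
                A[i][j] = co / (counts[i] * counts[j]) ** 0.5
    return A


def infinite_path_scores(A, damping=0.9, tol=1e-12, max_iter=10_000):
    """Row sums of sum_{l>=1} (r*A)^l for a nonnegative matrix A.

    The vector x = sum_{l>=0} (r*A)^l * 1 satisfies x = 1 + r*A*x, so it is
    found by fixed-point iteration; the scores are x - 1.
    """
    n = len(A)
    max_row_sum = max(sum(row) for row in A)
    if max_row_sum == 0:
        return [0.0] * n
    # For a nonnegative matrix the spectral radius is at most the largest
    # row sum, so this choice keeps the series convergent.
    r = damping / max_row_sum
    x = [1.0] * n
    for _ in range(max_iter):
        new_x = [1.0 + r * sum(A[i][j] * x[j] for j in range(n))
                 for i in range(n)]
        converged = max(abs(a - b) for a, b in zip(new_x, x)) < tol
        x = new_x
        if converged:
            break
    return [v - 1.0 for v in x]


def select_labels(Y, k):
    """Return the indices of the k highest-scoring labels and all scores."""
    scores = infinite_path_scores(label_affinity(Y))
    ranked = sorted(range(len(scores)), key=lambda j: -scores[j])
    return sorted(ranked[:k]), scores


# Toy label matrix: labels 0-2 co-occur, label 3 never co-occurs with others.
Y = [
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 0, 0, 1],
    [1, 1, 1, 0],
]
selected, scores = select_labels(Y, k=2)
print(selected, [round(s, 3) for s in scores])
```

In this toy matrix the isolated label lies on no path through the graph, so its score is zero and it is never selected, while the label that co-occurs most strongly with the others ranks first.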

