Abstract
Multi-label classification (MLC) tasks are encountered more and more frequently in machine learning applications. While MLC methods exist for the classical batch setting, only a few methods are available for the streaming setting. In this paper, we propose a new methodology for MLC via multi-target regression in a streaming setting. Moreover, we develop a streaming multi-target regressor, iSOUP-Tree, that uses this approach. We experimentally compare two variants of the iSOUP-Tree method (building regression and model trees), as well as ensembles of iSOUP-Trees, with state-of-the-art tree and ensemble methods for MLC on data streams. We evaluate these methods on a variety of measures of predictive performance appropriate for the MLC task. The ensembles of iSOUP-Trees perform significantly better on some of these measures, especially the ones based on label ranking, and are not significantly worse than the competitors on any of the remaining measures. We identify the thresholding problem for the task of MLC on data streams as a key issue that needs to be addressed in order to obtain even better predictive performance.
Highlights
The task of multi-label classification (MLC) has recently become very prominent in the machine learning research community (Gibaja and Ventura 2015)
We only have enough evidence to conclude that the Hoeffding tree with pruned sets (HTPS) significantly outperforms model trees in terms of macro-averaged precision (Precision_macro)
HTPS uses considerably less memory when compared to model and regression trees
Summary
The task of multi-label classification (MLC) has recently become very prominent in the machine learning research community (Gibaja and Ventura 2015). It can be seen as a generalization of the ubiquitous multi-class classification task: instead of a single label, each example is associated with multiple labels. While multi-class classification requires predicting only one of the possible labels, multi-label classification requires a model to predict a combination (subset) of the possible labels. This means that for each data instance x from an input space X, a model needs to provide a prediction y from an output space Y, which is the powerset of the labelset L, i.e., Y = 2^L. Binary relevance (BR) models have often been overlooked due to their inability to account for label correlations, though some BR methods are capable of modeling label correlations during classification.
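The core idea of MLC via multi-target regression can be sketched as follows. This is a minimal illustrative sketch, not the iSOUP-Tree method itself: the label subset is encoded as a 0/1 target vector over the labelset L, a multi-target regressor predicts one real-valued score per label, and a threshold tau (the choice of tau is exactly the thresholding problem the paper highlights; tau = 0.5 here is an assumption) converts the scores back into a predicted label subset.

```python
def labels_to_targets(labels, labelset):
    """Encode a label subset y (a subset of L) as a 0/1 target vector over L."""
    return [1.0 if l in labels else 0.0 for l in labelset]

def threshold_prediction(scores, labelset, tau=0.5):
    """Turn per-label regression scores into a predicted label subset.

    tau is an assumed fixed threshold; choosing it well on a stream is
    the open thresholding problem identified in the paper.
    """
    return {l for l, s in zip(labelset, scores) if s >= tau}

labelset = ["sports", "politics", "tech"]

# Encoding an example whose true labels are {sports, tech}.
targets = labels_to_targets({"sports", "tech"}, labelset)  # [1.0, 0.0, 1.0]

# Illustrative scores a multi-target regressor might output for one instance.
scores = [0.8, 0.1, 0.6]
predicted = threshold_prediction(scores, labelset)  # {'sports', 'tech'}
```

Any streaming multi-target regressor (such as iSOUP-Tree) can be plugged in to produce the score vector; the encoding and thresholding steps are what reduce MLC to multi-target regression.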