Abstract

The task of multi-target regression (MTR) is concerned with learning predictive models capable of predicting multiple target variables simultaneously. MTR has attracted increasing attention within the research community in recent years, yielding a variety of methods. The methods can be divided into two main groups: problem transformation and problem adaptation. The former transform an MTR problem into simpler (typically single-target) problems and apply known approaches, while the latter adapt the learning methods to handle the multiple target variables directly and learn models that predict all of the targets simultaneously. Studies have identified the latter group of methods as having a competitive advantage over the former, probably because these methods exploit the interrelations among the multiple targets. In the related task of multi-label classification, it has recently been shown that organizing the multiple labels into a hierarchical structure can improve predictive performance. In this paper, we investigate whether organizing the targets into a hierarchical structure can improve performance on MTR problems. More precisely, we propose to structure the multiple target variables into a hierarchy of variables, thus translating the task of MTR into a task of hierarchical multi-target regression (HMTR). We use four data-driven methods for devising the hierarchical structure; they cluster either the real values of the targets or the feature importance scores with respect to the targets. The evaluation of the proposed methodology on 16 benchmark MTR datasets reveals that structuring the multiple target variables into a hierarchy improves the predictive performance of the corresponding MTR models. The results also show that data-driven methods produce hierarchies that can improve the predictive performance even more than expert-constructed hierarchies. Finally, the improvement in predictive performance is more pronounced for the datasets with very large numbers (more than a hundred) of targets.
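
The idea of deriving a hierarchy from the target values can be illustrated with a minimal sketch. It is not one of the paper's four methods: it assumes plain average-linkage agglomerative clustering with a correlation-based distance, and the function names (`pearson`, `leaves`, `build_target_hierarchy`) and the toy data are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(a, b):
    """Pearson correlation between two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # constant column: treat as uncorrelated
    return cov / sqrt(var_a * var_b)


def leaves(node):
    """Return the target names found under a hierarchy node."""
    if isinstance(node, str):
        return [node]
    return [name for child in node for name in leaves(child)]


def build_target_hierarchy(targets):
    """Merge targets bottom-up into a binary hierarchy (nested tuples).

    targets: dict mapping target name -> list of values, one per example.
    Assumption: distance is 1 - |Pearson correlation|, average linkage.
    """
    def distance(u, v):
        pairs = [(a, b) for a in leaves(u) for b in leaves(v)]
        total = sum(1 - abs(pearson(targets[a], targets[b])) for a, b in pairs)
        return total / len(pairs)

    nodes = list(targets)
    while len(nodes) > 1:
        # Merge the two closest nodes into a new internal node.
        u, v = min(combinations(nodes, 2), key=lambda pair: distance(*pair))
        nodes = [n for n in nodes if n not in (u, v)]
        nodes.append((u, v))
    return nodes[0]


# Made-up toy data: t1/t2 move together, and so do t3/t4.
toy_targets = {
    "t1": [1, 2, 3, 4],
    "t2": [2, 4, 6, 8.5],
    "t3": [4, 1, 3, 2],
    "t4": [8, 2, 6.5, 4],
}
hierarchy = build_target_hierarchy(toy_targets)
print(hierarchy)
```

On this toy data the strongly correlated pairs end up as siblings under a common root. An HMTR learner would then take such a tree as the structure of its output space.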

Highlights

  • In supervised learning, the main goal is to learn, from a set of examples with known output values, a function predicting the target value of a previously unseen example

  • PCT-PCT-FR (PCT: predictive clustering tree) refers to clustering the output space consisting of feature rankings with PCTs and using single PCTs for building the model, etc.

  • In the average rank diagrams for the clustering methods over the feature ranking space, we can see that PCT-BkM-FR (BkM: balanced k-means clustering) is the best-performing method and is significantly better than all the others

Summary

INTRODUCTION

S. Nikoloski et al.: Data-Driven Structuring of the Output Space Improves the Performance of Multi-Target Regressors

The main goal is to learn, from a set of examples with known output (target) values, a function predicting the target value of a previously unseen example. When the output consists of several continuous/numeric variables, the task at hand is multi-target regression (MTR). We selected predictive clustering trees (PCTs) since they are global models that can be used for different structured output prediction tasks (including MTR and HMTR) and they can be constructed very efficiently. They are able to make predictions for several types of structured outputs, such as tuples of numerical/discrete values, time series, and hierarchies of variables. The results from the evaluation reveal that constructing the hierarchies with data-driven approaches achieves better predictive performance than either the flat multi-target regression task or the pre-defined hierarchy created by a domain expert.

BACKGROUND
EVALUATION MEASURES
PARAMETER INSTANTIATION
RESULTS
CONCLUSION