Abstract

We introduce a novel boosting algorithm called ‘KTBoost’ that combines kernel boosting and tree boosting. In each boosting iteration, the algorithm adds either a regression tree or a reproducing kernel Hilbert space (RKHS) regression function to the ensemble of base learners. Intuitively, the idea is that discontinuous trees and continuous RKHS regression functions complement each other, and that this combination allows for better learning of functions that have parts with varying degrees of regularity, such as discontinuities and smooth parts. We empirically show that KTBoost significantly outperforms both tree and kernel boosting in terms of predictive accuracy in a comparison on a wide array of data sets.

Highlights

  • Boosting algorithms [8,15,17,18,28] enjoy great popularity in both applied data science and machine learning research due, among other things, to their high predictive accuracy observed on a wide range of data sets [11].

  • Reproducing kernel Hilbert space (RKHS) regression is a form of non-parametric regression that shows state-of-the-art predictive accuracy on many data sets: it can, for instance, achieve near-optimal test errors [1,2], and kernel classifiers parallel the behaviors of deep networks, as noted in Zhang et al. [46].

  • To briefly illustrate that the combination of trees and RKHS functions as base learners can achieve higher predictive accuracy, we report in Fig. 1 test mean squared errors (MSEs) versus the number of boosting iterations for one data set.

Introduction

Boosting algorithms [8,15,17,18,28] enjoy great popularity in both applied data science and machine learning research due, among other things, to their high predictive accuracy observed on a wide range of data sets [11]. We relax the assumption of using only one type of base learner by combining regression trees [7] and reproducing kernel Hilbert space (RKHS) regression functions [4,39] as base learners. RKHS regression is a form of non-parametric regression that shows state-of-the-art predictive accuracy on many data sets: it can, for instance, achieve near-optimal test errors [1,2], and kernel classifiers parallel the behaviors of deep networks, as noted in Zhang et al. [46]. Since base learners in boosting necessarily need to have low complexity [44], continuous, or smooth, RKHS functions have the potential to complement discontinuous trees as base learners.
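
To make the idea concrete, the following is a minimal sketch of a KTBoost-style iteration for the squared-error loss. It uses scikit-learn's DecisionTreeRegressor and KernelRidge as stand-ins for the tree and RKHS base learners; the RBF kernel, all hyperparameters, and the selection rule (keep whichever candidate most reduces the training loss) are illustrative assumptions, not the paper's exact specification.

```python
# Illustrative KTBoost-style boosting for squared-error loss.
# Assumptions: scikit-learn base learners, RBF kernel, fixed
# hyperparameters; not the paper's reference implementation.
import numpy as np
from sklearn.tree import DecisionTreeRegressor
from sklearn.kernel_ridge import KernelRidge

def ktboost_fit(X, y, n_iter=100, lr=0.1):
    f0 = y.mean()                      # initial constant prediction
    F = np.full(len(y), f0)
    learners = []
    for _ in range(n_iter):
        resid = y - F                  # negative gradient of the L2 loss
        tree = DecisionTreeRegressor(max_depth=5).fit(X, resid)
        kern = KernelRidge(alpha=1.0, kernel="rbf", gamma=0.1).fit(X, resid)
        # Keep whichever base learner reduces the training loss more,
        # mirroring the per-iteration tree-vs-kernel choice in KTBoost.
        best = min((tree, kern),
                   key=lambda h: np.mean((resid - lr * h.predict(X)) ** 2))
        F += lr * best.predict(X)
        learners.append(best)
    return f0, learners

def ktboost_predict(f0, learners, X, lr=0.1):
    F = np.full(X.shape[0], f0)
    for h in learners:
        F += lr * h.predict(X)
    return F
```

On data whose target has both jumps and smooth regions, such a hybrid can select trees where the residuals are discontinuous and kernel functions where they are smooth, which is the complementarity the paper exploits.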

Summary of Results
Related Work
Boosting
Reproducing Kernel Hilbert Space Regression
Regression Trees
Combined Kernel and Tree Boosting
Reducing Computational Costs for Large Data
Simulation Study
Real-World Data
Conclusions