Abstract

Background: Recursive partitioning is a non-parametric modeling technique widely used in regression and classification problems. Model-based recursive partitioning identifies groups of observations with similar values of the parameters of the model of interest. The mob() function in the party package in R implements the model-based recursive partitioning method, which produces predictions based on single tree models. Predictions obtained through single tree models are very sensitive to small changes to the learning sample. We extend the model-based recursive partitioning method to produce predictions based on multiple tree models constructed on random samples of the learning data, drawn either by bootstrapping (random sampling with replacement) or by subsampling (random sampling without replacement).

Results: Here we present an R package called “mobForest” that implements bagging and random forests methodology for model-based recursive partitioning. The mobForest package constructs a large number of model-based trees and aggregates the predictions across these trees, resulting in more stable predictions. The package also includes functions for computing predictive accuracy estimates and for producing predictive accuracy plots, residual plots, and variable importance plots.

Conclusion: The mobForest package implements a random forest type approach for model-based recursive partitioning. The R package, along with its source code, is available at http://CRAN.R-project.org/package=mobForest.

Highlights

  • Recursive partitioning is a non-parametric modeling technique, widely used in regression and classification problems

  • Recursive partitioning methods like Random Forests™ [1] can handle a large number of predictor variables, even in the presence of complex interactions

  • “Classification and regression trees” (CART) [2] is one of the most commonly used recursive partitioning methods; it can select, from among a large number of variables, those that are most important in explaining the outcome variable

Introduction

Recursive partitioning is a non-parametric modeling technique widely used in regression and classification problems. The mob() function in the party package in R implements the model-based recursive partitioning method. This method produces predictions based on single tree models. We extend the model-based recursive partitioning method to produce predictions based on multiple tree models constructed on random samples of the learning data, drawn either by bootstrapping (random sampling with replacement) or by subsampling (random sampling without replacement).

“Classification and regression trees” (CART) [2] is one of the most commonly used recursive partitioning methods; it can select, from among a large number of variables, those that are most important in explaining the outcome variable. The basic idea of the CART algorithm is to sequentially split the data to identify groups of observations with similar values of the response variable. The partitioning of the data continues until a stopping condition is met, such as a) the nodes contain observations of only one class, b) no predictor variable shows a strong association with the response within a given node, or c) the number of observations within a node is less than the specified minimum threshold.
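The splitting loop and the three stopping conditions listed above can be illustrated with a minimal sketch. It is not the CART reference implementation and not the code behind mob() or mobForest. The function names (gini, best_split, grow, predict), the parameters min_size and min_gain, and the toy data are all hypothetical. Gini impurity reduction is used here as an assumed stand-in for "strength of association"; the source does not specify which measure is used.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure node)."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(rows, labels):
    """Return (gain, feature_index, threshold) of the best binary split, or None."""
    parent = gini(labels)
    n = len(rows)
    best = None
    for j in range(len(rows[0])):
        # Candidate thresholds: every observed value except the largest,
        # so that the right-hand child is never empty.
        for t in sorted({r[j] for r in rows})[:-1]:
            left = [y for r, y in zip(rows, labels) if r[j] <= t]
            right = [y for r, y in zip(rows, labels) if r[j] > t]
            weighted = (len(left) * gini(left) + len(right) * gini(right)) / n
            gain = parent - weighted
            if best is None or gain > best[0]:
                best = (gain, j, t)
    return best


def grow(rows, labels, min_size=2, min_gain=1e-3):
    """Recursively split the data until one of the stopping conditions holds."""
    majority = Counter(labels).most_common(1)[0][0]
    # (a) the node is pure, or (c) the node has too few observations.
    if len(set(labels)) == 1 or len(rows) < min_size:
        return {"leaf": majority}
    split = best_split(rows, labels)
    # (b) no predictor is sufficiently associated with the response
    # (assumption: measured here by Gini impurity reduction).
    if split is None or split[0] < min_gain:
        return {"leaf": majority}
    _, j, t = split
    left_idx = [i for i, r in enumerate(rows) if r[j] <= t]
    right_idx = [i for i, r in enumerate(rows) if r[j] > t]
    return {
        "feature": j,
        "threshold": t,
        "left": grow([rows[i] for i in left_idx],
                     [labels[i] for i in left_idx], min_size, min_gain),
        "right": grow([rows[i] for i in right_idx],
                      [labels[i] for i in right_idx], min_size, min_gain),
    }


def predict(tree, row):
    """Follow the splits from the root down to a leaf and return its class."""
    while "leaf" not in tree:
        branch = "left" if row[tree["feature"]] <= tree["threshold"] else "right"
        tree = tree[branch]
    return tree["leaf"]


# Toy data (hypothetical): the first predictor separates the two classes.
rows = [(1.0, 5.0), (2.0, 3.0), (3.0, 8.0), (6.0, 1.0), (7.0, 9.0), (8.0, 4.0)]
labels = ["a", "a", "a", "b", "b", "b"]
tree = grow(rows, labels)
print(tree)
print(predict(tree, (2.5, 0.0)), predict(tree, (6.5, 0.0)))
```

A single tree grown this way depends heavily on the particular learning sample. This is the sensitivity that motivates aggregating predictions over many trees built on bootstrap samples or subsamples.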
