Abstract
Background: Recursive partitioning is a non-parametric modeling technique widely used in regression and classification problems. Model-based recursive partitioning identifies groups of observations with similar values of the parameters of the model of interest. The mob() function in the party package in R implements the model-based recursive partitioning method. This method produces predictions based on single tree models, and such predictions are very sensitive to small changes in the learning sample. We extend the model-based recursive partitioning method to produce predictions based on multiple tree models constructed on random samples of the learning data, obtained either through bootstrapping (random sampling with replacement) or subsampling (random sampling without replacement).
Results: Here we present an R package called “mobForest” that implements bagging and random forests methodology for model-based recursive partitioning. The mobForest package constructs a large number of model-based trees and aggregates predictions across these trees, resulting in more stable predictions. The package also includes functions for computing predictive accuracy estimates and for producing predictive accuracy plots, residual plots, and variable importance plots.
Conclusion: The mobForest package implements a random forest type approach for model-based recursive partitioning. The R package, along with its source code, is available at http://CRAN.R-project.org/package=mobForest.
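The two resampling schemes and the aggregation step described in the abstract can be illustrated with a minimal sketch. It is written in Python with the standard library only and is not the mobForest API: the function names, the subsampling fraction of 0.632, and the use of a simple least-squares line as a stand-in for a model-based tree are all assumptions made for illustration.

```python
import random

def draw_sample(n, method="bootstrap", fraction=0.632, rng=random):
    """Return row indices for one model's learning sample."""
    if method == "bootstrap":
        # Sampling with replacement: n draws, duplicates allowed.
        return [rng.randrange(n) for _ in range(n)]
    if method == "subsample":
        # Sampling without replacement: a fraction of distinct rows.
        # The default fraction is an illustrative assumption.
        return rng.sample(range(n), int(round(fraction * n)))
    raise ValueError(f"unknown method: {method}")

def fit_line(xs, ys):
    """Stand-in base learner (hypothetical): least-squares line y ~ x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx > 0 else 0.0  # fall back to the mean if x is constant
    return lambda x: my + slope * (x - mx)

def ensemble_predict(xs, ys, x_new, n_models=50, method="bootstrap", seed=1):
    """Fit one model per random sample and average the predictions."""
    rng = random.Random(seed)
    preds = []
    for _ in range(n_models):
        idx = draw_sample(len(xs), method, rng=rng)
        model = fit_line([xs[i] for i in idx], [ys[i] for i in idx])
        preds.append(model(x_new))
    return sum(preds) / len(preds)
```

Calling `ensemble_predict(xs, ys, x_new, method="subsample")` switches from bootstrapping to subsampling. Averaging over many resampled fits is what makes the aggregated prediction less sensitive to small changes in the learning sample than any single fit.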
Highlights
Recursive partitioning is a non-parametric modeling technique, widely used in regression and classification problems
Recursive partitioning methods like Random Forests™ [1] are able to deal with a large number of predictor variables even in the presence of complex interactions
“Classification and regression trees” (CART) [2] is one of the most commonly used recursive partitioning methods; it can select, from among a large number of variables, those that are most important in explaining the outcome variable
Summary
Recursive partitioning is a non-parametric modeling technique widely used in regression and classification problems. The mob() function in the party package in R implements the model-based recursive partitioning method. This method produces predictions based on single tree models. We extend the model-based recursive partitioning method to produce predictions based on multiple tree models constructed on random samples of the learning data, obtained either through bootstrapping (random sampling with replacement) or subsampling (random sampling without replacement). “Classification and regression trees” (CART) [2] is one of the most commonly used recursive partitioning methods; it can select, from among a large number of variables, those that are most important in explaining the outcome variable. The basic idea of the CART algorithm is to sequentially split the data to identify groups of observations with similar values of the response variable. The partitioning of the data continues until a stopping condition is met, such as: (a) the node contains observations of only one class, (b) no predictor variable shows a strong association with the response within the node, or (c) the number of observations within the node is less than the specified minimum threshold.
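The sequential splitting and the three stopping conditions listed above can be sketched as follows. This is a simplified illustration in Python for a classification outcome with numeric predictors, not the CART or party implementation. The names `grow`, `best_split`, `min_size`, and `min_gain` are hypothetical, and a minimum reduction in Gini impurity is used here as an assumed stand-in for "no predictor shows strong association".

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels):
    """Return (gain, feature index, threshold) of the best binary split, or None."""
    parent, n, best = gini(labels), len(rows), None
    for j in range(len(rows[0])):
        # Candidate thresholds: every distinct value except the largest,
        # so that both child nodes are non-empty.
        for t in sorted({r[j] for r in rows})[:-1]:
            left = [y for r, y in zip(rows, labels) if r[j] <= t]
            right = [y for r, y in zip(rows, labels) if r[j] > t]
            gain = parent - (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or gain > best[0]:
                best = (gain, j, t)
    return best

def grow(rows, labels, min_size=2, min_gain=0.01):
    """Recursively split the data until a stopping condition is met."""
    leaf = {"predict": Counter(labels).most_common(1)[0][0], "n": len(labels)}
    if len(set(labels)) == 1:                  # (a) only one class in the node
        return leaf
    if len(labels) < min_size:                 # (c) node smaller than the minimum
        return leaf
    split = best_split(rows, labels)
    if split is None or split[0] < min_gain:   # (b) no sufficiently useful predictor
        return leaf
    _, j, t = split
    left = [(r, y) for r, y in zip(rows, labels) if r[j] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[j] > t]
    return {
        "feature": j,
        "threshold": t,
        "left": grow([r for r, _ in left], [y for _, y in left], min_size, min_gain),
        "right": grow([r for r, _ in right], [y for _, y in right], min_size, min_gain),
    }

def predict(tree, row):
    """Follow the splits down to a leaf and return its majority class."""
    while "predict" not in tree:
        branch = "left" if row[tree["feature"]] <= tree["threshold"] else "right"
        tree = tree[branch]
    return tree["predict"]
```

For example, `grow([[1.0], [2.0], [3.0], [10.0], [11.0], [12.0]], ["a", "a", "a", "b", "b", "b"])` splits once on the single predictor and then stops, because both child nodes contain only one class.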