Abstract
Boosting is an iterative algorithm that combines simple classification rules, each with mediocre misclassification error rates, into a highly accurate classification rule. Stochastic gradient boosting enhances this procedure by incorporating a random mechanism at each boosting step, improving both accuracy and the speed of generating the ensemble. ada is an R package that implements three popular variants of boosting, together with a version of stochastic gradient boosting. In addition, useful plots for data analytic purposes are provided, along with an extension to the multi-class case. The algorithms are illustrated with synthetic and real data sets.
Highlights
Boosting has proved to be an effective method to improve the performance of base classifiers, both theoretically and empirically.
The following code shows how to predict with Real AdaBoost:
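A minimal sketch of fitting a Real AdaBoost model with ada and predicting on held-out data. The data set, train/test split, and parameter values here are illustrative assumptions, not the paper's own example; see `?predict.ada` for the exact names of the prediction `type` options.

```r
# Sketch: Real AdaBoost with the ada package (assumes ada and rpart are installed)
library(ada)

data(iris)
# ada targets two-class problems, so keep two species (illustrative choice)
iris2 <- iris[iris$Species != "setosa", ]
iris2$Species <- factor(iris2$Species)

set.seed(1)
train <- sample(nrow(iris2), 70)

# type = "real" selects the Real AdaBoost variant; iter is the number of boosting steps
fit <- ada(Species ~ ., data = iris2[train, ], type = "real", iter = 50)

# Predicted class labels for the held-out rows
pred_class <- predict(fit, newdata = iris2[-train, ])

# Class probability estimates (option name per predict.ada's type argument)
pred_prob <- predict(fit, newdata = iris2[-train, ], type = "prob")
```

The same `predict` call works for the Discrete and Gentle AdaBoost variants; only the `type` argument to `ada` changes.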
Remark: The probability class estimate for any boosting algorithm is defined as P(Y = 1 | x) = e^{F(x)} / (e^{-F(x)} + e^{F(x)}), where F(x) denotes the ensemble's score function; this is equivalent to 1 / (1 + e^{-2 F(x)}).
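The logistic transform in the remark can be sketched directly; the helper name `boost_prob` is an illustrative assumption, not part of the ada API.

```r
# Sketch: converting an ensemble score F(x) into a class probability,
# following P(Y = 1 | x) = e^F / (e^-F + e^F) = 1 / (1 + e^(-2F))
boost_prob <- function(F) 1 / (1 + exp(-2 * F))

boost_prob(0)   # a score of 0 maps to probability 0.5
boost_prob(3)   # a large positive score maps to a probability near 1
```

Note that the factor of 2 in the exponent reflects the boosting convention of scoring on the {-1, +1} label scale, so these estimates are monotone in F but can be poorly calibrated near 0 and 1.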
Summary
Boosting has proved to be an effective method to improve the performance of base classifiers, both theoretically and empirically. In addition to pharmacology, boosting algorithms have been applied across a wide range of domains, including tumor identification and gene expression data [7], proteomics data [24], financial and marketing data [2; 18], fisheries data [17], and microscope imaging data [15]. For many of these applications, ada will be useful, since it implements well-documented tools for assessing variable importance, evaluating training and testing error rates, and viewing pairwise plots of the data. The mboost package has, to a large extent, functionality similar to the gbm package and, in addition, implements the general gradient boosting framework using regression-based learners. In our experience, these packages are better suited for users who need boosting in models with a continuous or count-type outcome.