Abstract

Boosting refers to a family of methods that combine sequences of individual classifiers into highly accurate ensemble models through weighted voting. AdaBoost, short for “Adaptive Boosting”, is the best-known boosting algorithm. AdaBoost has many strengths. Among them, ample empirical evidence points to its performance being generally superior to that of individual classifiers. In addition, even when combining a large number of weak learners, AdaBoost can be very robust to overfitting, usually achieving lower generalization error than competing ensemble methodologies such as bagging and random forests. However, AdaBoost, like most hard-margin classifiers, tends to be sensitive to outliers and noisy data, since it assigns higher weights to misclassified observations in subsequent iterations. It has recently been proven that for any booster based on a convex potential loss function, and any nonzero random classification noise rate, there is a data set that can be efficiently learned by the booster in the absence of noise but cannot be learned with accuracy better than 1/2 when random classification noise is present. Several techniques to identify and potentially delete (peel) noisy samples in binary classification are proposed in order to improve the performance of AdaBoost. It is found that peeling methods generally perform better than AdaBoost and other noise-resistant boosters, especially when high levels of noise are present in the data.
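The following is a minimal sketch of the peeling idea described above, not the paper's exact procedure: fit AdaBoost on noisy training data, flag the samples the ensemble persistently misclassifies (strongly negative margins) as candidate noise, remove them, and refit. The margin-based criterion, the 10% peeling cutoff, and the use of scikit-learn's AdaBoostClassifier are illustrative assumptions.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split

rng = np.random.RandomState(0)

# Toy binary classification problem; inject 20% random label noise
# into the training labels only, so the clean test set can measure
# generalization.
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
flip = rng.rand(len(y_tr)) < 0.20
y_tr_noisy = np.where(flip, 1 - y_tr, y_tr)

# Baseline AdaBoost trained directly on the noisy labels.
base = AdaBoostClassifier(n_estimators=200, random_state=0)
base.fit(X_tr, y_tr_noisy)

# Signed ensemble score; a point's margin is positive when the ensemble
# agrees with its (possibly noisy) label. Persistently misclassified
# points have strongly negative margins and are treated as candidate noise.
scores = base.decision_function(X_tr)
margins = np.where(y_tr_noisy == 1, scores, -scores)

# "Peel" the worst 10% of training points by margin (assumed cutoff)
# and refit AdaBoost on the remaining data.
peel = margins < np.quantile(margins, 0.10)
peeled = AdaBoostClassifier(n_estimators=200, random_state=0)
peeled.fit(X_tr[~peel], y_tr_noisy[~peel])

print("baseline accuracy on clean test set:", base.score(X_te, y_te))
print("peeled   accuracy on clean test set:", peeled.score(X_te, y_te))
```

Under these assumptions, the peeled model typically recovers part of the accuracy lost to label noise; the paper's own peeling criteria and noise-resistant baselines may differ.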
