Abstract

The goal of this paper is to demonstrate via extensive simulation that implicit robustness can substantially outperform explicit robustness in the pattern recognition of contaminated high dimension low sample size data. Our work specifically demonstrates, via extensive computational simulations and applications to real-life data, that random subspace ensemble learning machines, although not explicitly structurally designed as robustness-inducing supervised learning paradigms, outperform the structurally robustness-seeking classifiers on high dimension low sample size datasets. Random forest (RF), which is arguably the most commonly used random subspace ensemble learning method, is compared to various robust extensions/adaptations of the discriminant analysis classifier, and our work reveals that RF, although not inherently designed to be robust to outliers, substantially outperforms the existing techniques specifically designed to achieve robustness. Specifically, by exploring different scenarios of the sample size n and the input space dimensionality p, along with the corresponding capacity κ = n/p with κ < 1, we demonstrate through extensive simulations that regardless of the contamination rate ϵ, RF predictively outperforms the explicitly robustness-inducing classification techniques when the intrinsic dimensionality of the data is large.
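The experimental setup summarized above can be sketched in a few lines. The following is a minimal illustration, not the authors' actual protocol: it assumes Gaussian two-class data with signal in a handful of coordinates, an HDLSS regime with capacity κ = n/p < 1, and contamination injected as gross additive outliers at rate ε; the specific parameter values and the use of scikit-learn's `RandomForestClassifier` are illustrative choices.

```python
# Hypothetical sketch of the paper's HDLSS contamination setting:
# n < p (kappa = n/p < 1), a fraction eps of training points contaminated.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
n, p, eps = 60, 200, 0.10        # kappa = n/p = 0.3 < 1 (illustrative values)
assert n / p < 1

# Two Gaussian classes separated along the first five coordinates.
y = rng.integers(0, 2, size=n)
X = rng.normal(size=(n, p))
X[:, :5] += 2.0 * y[:, None]

# Contaminate a fraction eps of the training points with gross outliers.
m = int(eps * n)
idx = rng.choice(n, size=m, replace=False)
X[idx] += rng.normal(scale=10.0, size=(m, p))

# Random forest: a random subspace ensemble, implicitly (not structurally)
# robust, trained directly on the contaminated sample.
rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)

# Evaluate predictive performance on a clean test set from the same model.
y_te = rng.integers(0, 2, size=500)
X_te = rng.normal(size=(500, p))
X_te[:, :5] += 2.0 * y_te[:, None]
acc = rf.score(X_te, y_te)
```

In the paper's comparison, the same contaminated training sample would also be fed to explicitly robust discriminant-analysis variants, and test accuracies compared across grids of n, p, and ε.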
