Abstract

This paper presents an empirical comparison of three subsampling techniques for random subspace ensemble classifiers. A variant of the random subspace ensemble designed to address the challenges of high-dimensional classification, termed the random subsample ensemble and built within the voting-combiner framework, was evaluated under three sampling methods: random sampling without replacement, random sampling with replacement, and random partitioning. The ensemble was instantiated with three base learners (C4.5, k-nearest neighbor, and naive Bayes) and tested on five high-dimensional machine-learning benchmark data sets. The simulation results identified random sampling without replacement as the best-performing sampling technique for the ensemble.
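To illustrate the kind of ensemble being compared, the following is a minimal sketch of a random subspace ensemble with a majority-vote combiner and a selectable feature-sampling mode (without replacement, with replacement, or disjoint partitioning). The class name, parameters, and the use of scikit-learn's DecisionTreeClassifier as a stand-in for C4.5 are illustrative assumptions, not the authors' implementation.

```python
# Sketch only: a random subspace ("random subsample") ensemble with majority voting.
# Assumes scikit-learn base learners; DecisionTreeClassifier stands in for C4.5.
import numpy as np
from collections import Counter
from sklearn.base import clone
from sklearn.tree import DecisionTreeClassifier


class RandomSubspaceEnsemble:
    def __init__(self, base_learner=None, n_members=10, subspace_size=0.5,
                 sampling="without", random_state=None):
        # sampling: "without" (no replacement), "with" (replacement),
        # or "partition" (disjoint feature blocks, one per member)
        self.base_learner = base_learner or DecisionTreeClassifier()
        self.n_members = n_members
        self.subspace_size = subspace_size
        self.sampling = sampling
        self.random_state = random_state

    def _draw_subspaces(self, n_features, rng):
        k = max(1, int(self.subspace_size * n_features))
        if self.sampling == "without":
            return [rng.choice(n_features, size=k, replace=False)
                    for _ in range(self.n_members)]
        if self.sampling == "with":
            return [rng.choice(n_features, size=k, replace=True)
                    for _ in range(self.n_members)]
        if self.sampling == "partition":
            # Disjoint blocks: each member sees a non-overlapping feature slice.
            return np.array_split(rng.permutation(n_features), self.n_members)
        raise ValueError(f"unknown sampling mode: {self.sampling}")

    def fit(self, X, y):
        rng = np.random.default_rng(self.random_state)
        self.subspaces_ = self._draw_subspaces(X.shape[1], rng)
        self.members_ = []
        for cols in self.subspaces_:
            model = clone(self.base_learner)
            model.fit(X[:, cols], y)
            self.members_.append(model)
        return self

    def predict(self, X):
        # Unweighted majority vote over the member predictions.
        votes = np.array([m.predict(X[:, cols])
                          for m, cols in zip(self.members_, self.subspaces_)])
        return np.array([Counter(col).most_common(1)[0][0] for col in votes.T])


# Illustrative usage on synthetic high-dimensional data (not the paper's benchmarks):
# from sklearn.datasets import make_classification
# X, y = make_classification(n_samples=200, n_features=100, random_state=0)
# clf = RandomSubspaceEnsemble(sampling="without", random_state=0).fit(X, y)
```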
