Classification performance enhancement using boundary based sampling algorithm

Safaa O Al-Mamory

doi:10.1109/ntict.2017.7976104

Abstract

Random sampling can enhance classification performance by selecting many representative samples for inclusion in the training dataset. The representative samples usually include those located at the border of each class or cluster. In this paper, a new sampling algorithm is proposed that forces the training sample to include the border points between classes. Whether a point is considered a border point depends on its density, measured by the number of neighbors within a specific radius, reflecting the idea of the DBSCAN clustering algorithm. Experimental results on several datasets from the UCI Machine Learning Repository show that the new algorithm outperforms the hold-out and cross-validation algorithms, enhances the performance of several classification algorithms by 1%, and reduces the error rate.
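The density criterion described in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm, whose exact rule the abstract does not give. It assumes the standard DBSCAN definition instead: a point is a border point if it has fewer than `min_pts` neighbors within radius `eps` (itself included) but lies within `eps` of a point that does have that many. The names `neighbor_counts`, `border_indices`, and `boundary_sample` and the parameters `eps`, `min_pts`, and `train_fraction` are hypothetical.

```python
import math
import random


def neighbor_counts(points, eps):
    """For each point, count the points (itself included) within distance eps."""
    return [sum(1 for q in points if math.dist(p, q) <= eps) for p in points]


def border_indices(points, eps, min_pts):
    """Indices of DBSCAN-style border points (assumed criterion, see lead-in).

    A border point is not dense enough to be a core point, but lies within
    eps of at least one core point.
    """
    counts = neighbor_counts(points, eps)
    core = [i for i, c in enumerate(counts) if c >= min_pts]
    border = []
    for i, c in enumerate(counts):
        if c >= min_pts:
            continue  # core point, not a border point
        if any(math.dist(points[i], points[j]) <= eps for j in core):
            border.append(i)
    return border


def boundary_sample(points, eps, min_pts, train_fraction, seed=0):
    """Split indices into train/test, forcing every border point into train.

    The remaining training slots are filled by uniform random sampling.
    """
    rng = random.Random(seed)
    border = border_indices(points, eps, min_pts)
    border_set = set(border)
    rest = [i for i in range(len(points)) if i not in border_set]
    rng.shuffle(rest)
    # Never drop a border point, even if that exceeds the requested fraction.
    n_train = max(len(border), round(train_fraction * len(points)))
    train = border + rest[: n_train - len(border)]
    train_set = set(train)
    test = [i for i in range(len(points)) if i not in train_set]
    return sorted(train), test


# Toy one-dimensional data: a small cluster plus one isolated point.
points = [(0,), (1,), (2,), (3,), (10,)]
train_idx, test_idx = boundary_sample(points, eps=1, min_pts=3, train_fraction=0.6)
```

In this toy data, the points at 1 and 2 are core points, the points at 0 and 3 are border points, and the point at 10 is isolated, so indices 0 and 3 always land in the training set. In a labeled dataset one option would be to run `border_indices` separately on the points of each class; that choice is also an assumption of this sketch.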

Full Text