Abstract

Random sampling can enhance classification performance by selecting many representative samples for inclusion in the training dataset. Representative samples usually include those located at the border of each class or cluster. This paper proposes a new sampling algorithm that forces the training sample to include the border points between classes. Whether a point is considered a border point depends on its density, measured as the number of neighbors within a specific radius, following the idea of the DBSCAN clustering algorithm. Experimental results on several datasets from the UCI Machine Learning Repository show that the new algorithm outperforms the hold-out and cross-validation methods, improves the performance of several classification algorithms by 1%, and reduces the error rate.
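The abstract does not spell out the exact border rule, so the following is only a minimal sketch under stated assumptions. It borrows the standard DBSCAN definitions: a point is a core point if at least `min_pts` points (itself included) lie within radius `eps`, and a border point if it is not a core point but lies within `eps` of one. The function names `border_mask` and `border_aware_split`, the parameters `eps`, `min_pts`, and `train_fraction`, and the choice to fill the rest of the training set uniformly at random are all hypothetical and may differ from the paper's algorithm, which also takes class membership into account.

```python
import numpy as np


def border_mask(X, eps, min_pts):
    """Flag DBSCAN-style border points (assumed rule, not the paper's exact one)."""
    # Pairwise Euclidean distances; O(n^2) memory, fine for a small illustration.
    diffs = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diffs ** 2).sum(axis=-1))
    within = dist <= eps                       # each point counts as its own neighbor
    core = within.sum(axis=1) >= min_pts       # dense enough to be a core point
    near_core = (within & core[None, :]).any(axis=1)
    return ~core & near_core                   # non-core, but within eps of a core point


def border_aware_split(X, eps, min_pts, train_fraction=0.7, seed=0):
    """Return (train_idx, test_idx) with every border point forced into training."""
    rng = np.random.default_rng(seed)
    n = len(X)
    border_idx = np.flatnonzero(border_mask(X, eps, min_pts))
    other_idx = np.setdiff1d(np.arange(n), border_idx)
    # The training set is never smaller than the set of border points.
    n_train = max(int(round(train_fraction * n)), len(border_idx))
    # Fill the remaining training slots by plain random sampling (an assumption).
    extra = rng.choice(other_idx, size=n_train - len(border_idx), replace=False)
    train_idx = np.sort(np.concatenate([border_idx, extra]))
    test_idx = np.setdiff1d(np.arange(n), train_idx)
    return train_idx, test_idx
```

For example, with the one-dimensional points 0, 1, 2, 3, 7, and 50, `eps=4.5`, and `min_pts=4`, the first four points are core points, 7 is a border point (it is close to 3 but has too few neighbors of its own), and 50 is noise. A split produced by `border_aware_split` would therefore always place the point 7 in the training set.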

