Abstract

Data reduction is especially important when applying the k-NN classifier to large datasets. Many prototype selection and generation algorithms have been proposed that aim to condense the initial training data as much as possible while keeping classification accuracy high. The Prototype Selection by Clustering (PSC) algorithm is one of them and is based on a cluster generation procedure. Unlike many other prototype selection and generation algorithms, its main goal is fast execution of the data reduction procedure rather than a high reduction rate. In this paper, we demonstrate that the reduction rate and the classification accuracy of PSC can be improved by generating a larger number of clusters. Moreover, we compare the performance of PSC with two state-of-the-art algorithms, one selection and one generation, using six real-life datasets. The experimental results indicate that the classification performance of PSC is comparable to that of its competitors when many clusters are used.
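
The abstract does not spell out the steps of PSC, so the following is only a minimal sketch of a cluster-based prototype selection of this kind, under stated assumptions: clusters come from a plain k-means pass; a cluster containing a single class contributes the instance nearest to its mean; a mixed cluster contributes border instances between its majority class and each other class. The function names (`kmeans_labels`, `psc_select`, `predict_1nn`), the toy data, and the tie-breaking choices are hypothetical and are not taken from the paper.

```python
import random
from collections import Counter


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_labels(points, k, iters=50, seed=0):
    """Plain Lloyd's k-means; returns one cluster label per point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # assumption: random initial centers
    labels = None
    for _ in range(iters):
        new_labels = [min(range(k), key=lambda c: sq_dist(p, centers[c]))
                      for p in points]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centers[c] = tuple(sum(col) / len(members)
                                   for col in zip(*members))
    return labels


def psc_select(X, y, k, seed=0):
    """Return sorted indices of the training instances kept as prototypes."""
    labels = kmeans_labels(X, k, seed=seed)
    selected = set()
    for c in range(k):
        idx = [i for i, l in enumerate(labels) if l == c]
        if not idx:
            continue
        classes = Counter(y[i] for i in idx)
        if len(classes) == 1:
            # Homogeneous cluster: keep the instance nearest to the mean.
            mean = tuple(sum(col) / len(idx)
                         for col in zip(*(X[i] for i in idx)))
            selected.add(min(idx, key=lambda i: sq_dist(X[i], mean)))
        else:
            # Mixed cluster: keep border instances around the majority class.
            major = classes.most_common(1)[0][0]
            major_idx = [i for i in idx if y[i] == major]
            for i in idx:
                if y[i] == major:
                    continue
                # Majority-class instance closest to this minority instance.
                pc = min(major_idx, key=lambda j: sq_dist(X[j], X[i]))
                # Instance of the minority class closest to that border point.
                same = [j for j in idx if y[j] == y[i]]
                pm = min(same, key=lambda j: sq_dist(X[j], X[pc]))
                selected.update((pc, pm))
    return sorted(selected)


def predict_1nn(X_proto, y_proto, query):
    """Classify a query with its single nearest prototype."""
    j = min(range(len(X_proto)), key=lambda i: sq_dist(X_proto[i], query))
    return y_proto[j]


# Toy data (made up for illustration): two well-separated classes.
X = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
y = ["a"] * 4 + ["b"] * 4

kept = psc_select(X, y, k=2)
X_proto = [X[i] for i in kept]
y_proto = [y[i] for i in kept]
print(len(kept), "of", len(X), "instances kept")
print(predict_1nn(X_proto, y_proto, (0.5, 0.5)))
```

In this sketch the number of clusters `k` is the only tuning knob, which is the parameter the abstract singles out. The toy data is far too small to say anything about reduction rate or accuracy; it only shows the mechanics.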
