Classification is a widely used data analysis technique that assigns instances to groups or categories according to established criteria. A classifier's success depends on both the algorithm itself and the quality of the data. In real-world applications, however, datasets inevitably contain mislabeled instances, which can cause misclassifications that the classifier has to handle. This study quantitatively assesses the classification of noisy data with a new kNN-based classification algorithm, which is intended to improve on the performance of classical kNN. We perform numerical experiments on real-world datasets to demonstrate the new algorithm's performance, and it achieves high accuracy on various noisy datasets. These results suggest that the technique can deliver high accuracy in binary classification problems. We compared the new kNN and classical kNN algorithms at several noise levels (10%, 20%, 30%, and 40%) on distinct datasets, measuring performance by test accuracy. We also compared the new algorithm with popular classification algorithms, and in the vast majority of cases it achieved higher test accuracy.
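The evaluation protocol described above (inject label noise at a given rate, then measure test accuracy) can be illustrated with a minimal sketch. It covers only the classical kNN baseline, since the abstract does not specify the proposed algorithm. Everything here is an assumption made for illustration: the synthetic two-class data, the symmetric label-flipping noise model, noise applied to the training labels only, the choice k = 5, and the hypothetical helper names `flip_labels`, `knn_predict`, `accuracy`, `make_blobs`, and `run_experiment`.

```python
import random
from collections import Counter


def flip_labels(labels, noise_rate, rng):
    """Return a copy of binary labels (0/1) with a fraction noise_rate flipped.

    Assumption: symmetric label noise; the abstract does not state the noise model.
    """
    noisy = list(labels)
    n_flip = round(noise_rate * len(noisy))
    for i in rng.sample(range(len(noisy)), n_flip):
        noisy[i] = 1 - noisy[i]
    return noisy


def knn_predict(train_X, train_y, x, k):
    """Classical kNN: majority vote among the k nearest training points."""
    def sq_dist(i):
        return sum((a - b) ** 2 for a, b in zip(train_X[i], x))

    nearest = sorted(range(len(train_X)), key=sq_dist)[:k]
    votes = Counter(train_y[i] for i in nearest)
    return votes.most_common(1)[0][0]


def accuracy(train_X, train_y, test_X, test_y, k):
    """Fraction of test points whose predicted label matches the clean label."""
    hits = sum(knn_predict(train_X, train_y, x, k) == y
               for x, y in zip(test_X, test_y))
    return hits / len(test_y)


def make_blobs(n_per_class, rng):
    """Synthetic stand-in data: two Gaussian blobs centred at (0, 0) and (3, 3)."""
    X, y = [], []
    for label, centre in ((0, 0.0), (1, 3.0)):
        for _ in range(n_per_class):
            X.append((rng.gauss(centre, 1.0), rng.gauss(centre, 1.0)))
            y.append(label)
    return X, y


def run_experiment(noise_levels=(0.0, 0.1, 0.2, 0.3, 0.4), k=5, seed=0):
    """Map each training-label noise rate to the baseline's test accuracy."""
    rng = random.Random(seed)
    train_X, train_y = make_blobs(100, rng)
    test_X, test_y = make_blobs(50, rng)  # test labels stay clean
    results = {}
    for rate in noise_levels:
        noisy_y = flip_labels(train_y, rate, rng)
        results[rate] = accuracy(train_X, noisy_y, test_X, test_y, k)
    return results


if __name__ == "__main__":
    for rate, acc in run_experiment().items():
        print(f"noise={rate:.0%}  test accuracy={acc:.3f}")
```

An odd k avoids tied votes in the binary case. A comparison like the one reported in the study would add a second classifier to the loop in `run_experiment` and replace the synthetic blobs with real-world datasets.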