Abstract

Some of the most widely used text classification methods, such as the K-Nearest Neighbor (KNN) algorithm, the Naive Bayes (NB) algorithm, and the Support Vector Machine (SVM) algorithm, perform well on balanced data but poorly on imbalanced data. Many researchers have proposed solutions to this problem; we propose a new method to improve the performance of the K-Nearest Neighbor classifier on imbalanced classification. In this paper, we combine the K-Nearest Neighbor classifier with a new feature selection method called NFS, an improved Synthetic Minority Over-sampling Technique (SMOTE), and the Tomek Links under-sampling technique. The experimental results demonstrate that the improved method significantly improves the classification efficiency of the K-Nearest Neighbor classifier on imbalanced datasets.
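
To make the resampling steps named in the abstract concrete, the following is a minimal sketch of how over-sampling, under-sampling, and KNN can be chained. It is an illustration under stated assumptions, not the method of this paper: it uses plain SMOTE rather than the improved variant, it omits the NFS feature selection step entirely because the abstract does not describe it, and it assumes a binary problem with at least two minority samples. The function names (`nearest`, `smote`, `remove_tomek_links`, `knn_predict`) and the toy data are hypothetical.

```python
import math
import random
from collections import Counter

def nearest(idx, X, candidates, k):
    """Indices of the k candidates closest to X[idx], excluding idx itself."""
    others = [j for j in candidates if j != idx]
    return sorted(others, key=lambda j: math.dist(X[idx], X[j]))[:k]

def smote(X, y, minority, k=3, seed=0):
    """Plain SMOTE (not the paper's improved variant): add synthetic minority
    points by interpolating between a minority point and one of its k nearest
    minority neighbours, until both classes have the same size."""
    rng = random.Random(seed)
    min_idx = [i for i, label in enumerate(y) if label == minority]
    n_new = (len(y) - len(min_idx)) - len(min_idx)  # majority minus minority
    X_new, y_new = list(X), list(y)
    for _ in range(max(n_new, 0)):
        i = rng.choice(min_idx)
        j = rng.choice(nearest(i, X, min_idx, k))
        gap = rng.random()  # position along the segment from X[i] to X[j]
        X_new.append(tuple(a + gap * (b - a) for a, b in zip(X[i], X[j])))
        y_new.append(minority)
    return X_new, y_new

def remove_tomek_links(X, y, majority):
    """A Tomek link is a pair of mutual nearest neighbours with different
    labels. Under-sampling drops the majority-class member of each link."""
    all_idx = range(len(X))
    nn = [nearest(i, X, all_idx, 1)[0] for i in all_idx]
    drop = {i for i in all_idx
            if y[i] == majority and y[nn[i]] != majority and nn[nn[i]] == i}
    keep = [i for i in all_idx if i not in drop]
    return [X[i] for i in keep], [y[i] for i in keep]

def knn_predict(X, y, query, k=3):
    """Majority vote among the k training points closest to the query."""
    order = sorted(range(len(X)), key=lambda i: math.dist(X[i], query))[:k]
    return Counter(y[i] for i in order).most_common(1)[0][0]

# Hypothetical toy data: six "neg" points and two "pos" points.
X = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3), (0.3, 0.2), (0.2, 0.4), (0.4, 0.1),
     (1.0, 1.0), (1.1, 0.9)]
y = ["neg"] * 6 + ["pos"] * 2

X_bal, y_bal = smote(X, y, minority="pos", k=1)
X_clean, y_clean = remove_tomek_links(X_bal, y_bal, majority="neg")
prediction = knn_predict(X_clean, y_clean, (1.05, 0.95), k=3)
```

Over-sampling first and cleaning second is one common ordering: SMOTE balances the class counts, and the Tomek link pass then removes majority points that sit directly against minority points near the class boundary. Whether the paper applies the steps in this order is not stated in the abstract.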

