Abstract

Classification of imbalanced data, where one class is heavily outnumbered by the others, is a well-explored problem in the data mining and machine learning community. Imbalanced class distributions occur naturally in real-world datasets and must be handled carefully to extract meaningful insights. On imbalanced datasets, traditional classifiers lose performance and misclassify instances. This paper proposes a fuzzy weighted nearest neighbor approach to address this issue. We adopt an algorithm-modification solution: an existing algorithm is modified to learn from imbalanced datasets, so data are classified without altering their natural distribution, unlike popular data-balancing methods. K nearest neighbor is a non-parametric classification method widely used in machine learning. Fuzzy nearest neighbor classification clarifies the degree to which an instance belongs to each class, and optimal weights combined with an improved nearest neighbor concept help classify imbalanced data correctly. The proposed hybrid approach accounts for the imbalanced nature of the data and reduces the inaccuracies that arise when the original, traditional classifiers are applied. Results show that it outperforms the existing fuzzy nearest neighbor and weighted neighbor strategies for imbalanced learning.
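
To make the general idea concrete, the following is a minimal sketch of a class-weighted fuzzy K nearest neighbor vote. It is not the WFAKNN algorithm of the paper: the inverse-frequency class weights, the fuzzifier `m`, and the function names `fuzzy_weighted_knn` and `predict` are assumptions chosen purely for illustration.

```python
import math
from collections import Counter

def fuzzy_weighted_knn(train_X, train_y, query, k=3, m=2.0):
    """Return fuzzy class memberships for `query` (illustrative sketch only)."""
    counts = Counter(train_y)
    smallest = min(counts.values())
    # Assumed weighting: the rarest class gets weight 1, larger classes get less.
    class_weight = {c: smallest / n for c, n in counts.items()}

    # Keep the k training points closest to the query (Euclidean distance).
    nearest = sorted(
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    )[:k]

    # Inverse-distance votes with exponent 2/(m-1), scaled by the class weight.
    scores = {c: 0.0 for c in counts}
    for d, label in nearest:
        vote = 1.0 / (d ** (2.0 / (m - 1.0)) + 1e-9)
        scores[label] += class_weight[label] * vote

    # Normalise so the memberships sum to 1.
    total = sum(scores.values())
    return {c: s / total for c, s in scores.items()}

def predict(train_X, train_y, query, k=3, m=2.0):
    """Assign the class with the highest fuzzy membership."""
    memberships = fuzzy_weighted_knn(train_X, train_y, query, k, m)
    return max(memberships, key=memberships.get)

# Toy data: five majority points (class 0) and one minority point (class 1).
X = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5), (3, 3)]
y = [0, 0, 0, 0, 0, 1]
label = predict(X, y, (2.2, 2.2), k=3)
```

In this toy example two of the three nearest neighbors belong to the majority class, so an unweighted majority vote would choose class 0, whereas the weighted fuzzy vote favors the nearby minority point.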

Highlights

  • The last few decades have borne witness to various developments in science and technology. These developments have empowered the generation of enormous amounts of data and opportunities for mining useful information from this data and other activities of data science

  • Our proposed method, the weighted fuzzy K nearest neighbor algorithm using an adaptive approach (WFAKNN), is compared experimentally with neighbor weighted K nearest neighbor (NWKNN) [36], the hybrid weighted nearest neighbor approach (Adpt-NWKNN), and the fuzzy neighbor weighted approach (Fuzzy-NWKNN) [37]

  • Tab. 4 reports the F-Measure, AUC and GMean of NWKNN, Adpt-NWKNN, Fuzzy-NWKNN and Weighted Fuzzy Adaptive KNN (WFAKNN) on all eight datasets for five values of K, from 5 to 25
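
Two of the measures named above can be computed directly from binary confusion-matrix counts. The sketch below uses the standard definitions of F-Measure and GMean; the function names and the example counts are hypothetical and are not taken from the paper's results.

```python
import math

def f_measure(tp, fp, fn):
    """Harmonic mean of precision and recall for the positive (minority) class."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

def g_mean(tp, fp, fn, tn):
    """Geometric mean of sensitivity and specificity."""
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    return math.sqrt(sensitivity * specificity)

# Hypothetical counts: 10 minority and 90 majority instances.
balanced = (f_measure(8, 2, 2), g_mean(8, 2, 2, 88))
# A classifier that labels everything as majority is 90% accurate here,
# yet both measures drop to zero because no minority instance is found.
all_majority = (f_measure(0, 0, 10), g_mean(0, 0, 10, 90))
```

This is why such measures are preferred over plain accuracy for imbalanced data: they penalize a classifier that ignores the minority class.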


Summary

Introduction

The last few decades have borne witness to various developments in science and technology. These developments have empowered the generation of enormous amounts of data and opportunities for mining useful information from this data and other activities of data science. This is already visible in various data mining applications [1,2]. Such applications face many challenges at different levels. The classifier’s accuracy will be biased towards the majority class and minority class

