Abstract

In machine learning, K-Nearest Neighbor (KNN) is a well-known supervised learning method. Traditional KNN has the inflexible requirement that a single value of 'K' be specified in advance for all test samples. Earlier approaches to predicting 'K' focus mainly on finding an optimal K value for every sample, but their time complexity is too high. In this paper, a Modified K-Nearest Neighbor algorithm with Variant K is proposed. The algorithm is divided into a training phase and a testing phase so that a K value can be found for every test sample. To obtain the optimal K value, the data is trained for various K values using a Min-Heap data structure of size 2*K. Candidate K values are decided based on the percentage of training data considered from every class. Indian Classical Music is taken as a case study, classifying pieces into different Ragas, with Pitch Class Distribution features as input to the proposed algorithm. It is observed that the use of the Min-Heap reduces the space complexity, while the Accuracy and F1-score of the proposed method are higher than those of the traditional KNN algorithm, as well as Support Vector Machine and Decision Tree classifiers, on both the Self-Generated Dataset and the Comp-Music Dataset.
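The space saving from heap-bounded neighbor selection can be sketched as follows. This is a minimal Python illustration, assuming Euclidean distance and a plain max-heap of a fixed size K; it does not reproduce the paper's exact variant (min-heap of size 2*K with a per-sample K):

```python
import heapq

def knn_predict(train, test_point, k):
    """Classify test_point by majority vote among its k nearest
    training samples. Candidates are kept in a bounded max-heap
    (heapq with negated distances), so only k entries are stored
    instead of sorting all n distances.

    Illustrative sketch only -- not the paper's exact algorithm."""
    heap = []  # entries: (-distance, label); root = farthest kept neighbor
    for features, label in train:
        dist = sum((a - b) ** 2 for a, b in zip(features, test_point)) ** 0.5
        if len(heap) < k:
            heapq.heappush(heap, (-dist, label))
        elif -heap[0][0] > dist:  # new point is closer than farthest kept
            heapq.heapreplace(heap, (-dist, label))
    votes = {}
    for _, label in heap:
        votes[label] = votes.get(label, 0) + 1
    return max(votes, key=votes.get)
```

Bounding the heap keeps memory at O(K) rather than O(n), which is the kind of space reduction the abstract attributes to the Min-Heap.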

Highlights

  • The K-nearest neighbors is a simple and effective classification algorithm

  • In [4] various distance functions are implemented with K-nearest Neighbor (KNN) on a medical dataset with different types of attributes

  • The proposed algorithm is presented as an extension of the traditional KNN algorithm

Summary

INTRODUCTION

The K-nearest neighbors algorithm is a simple and effective classification method. Its most important advantage is that its classification results can be interpreted. In [10], the authors proposed the Adaptive K-Nearest Neighbor (AdaKNN) algorithm, which uses the density and distribution of the neighborhood of a test point and learns a suitable K for it with the help of artificial neural networks. A similar strategy for correctly classifying the test point was employed by Wettschereck and Dietterich in [11], where the value of K is determined for different portions of the input space by applying cross-validation in its local neighborhood. In [15], the authors introduced a training phase into the KNN classification algorithm and proposed a k*Tree method to learn different optimal K values for different test samples.
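The locally adaptive scheme attributed to [11] can be sketched as follows. This is a hypothetical Python illustration, assuming Euclidean distance and an illustrative neighborhood size `m` that is not taken from the paper; it is not the cited authors' exact procedure:

```python
def choose_local_k(train, test_point, k_candidates, m=10):
    """Pick a K for this particular test point by leave-one-out
    validation inside its local neighborhood, in the spirit of the
    locally adaptive cross-validation scheme of [11].

    `m` (local neighborhood size) is an assumed parameter for
    demonstration purposes."""
    dist = lambda p, q: sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5
    # The m nearest training samples form the local validation set.
    local = sorted(train, key=lambda s: dist(s[0], test_point))[:m]
    best_k, best_acc = k_candidates[0], -1.0
    for k in k_candidates:
        correct = 0
        for i, (x, y) in enumerate(local):
            rest = local[:i] + local[i + 1:]  # leave sample i out
            neighbors = sorted(rest, key=lambda s: dist(s[0], x))[:k]
            labels = [lab for _, lab in neighbors]
            pred = max(set(labels), key=labels.count)  # majority vote
            correct += pred == y
        acc = correct / len(local)
        if acc > best_acc:
            best_k, best_acc = k, acc
    return best_k
```

Each test point can thus receive its own K, chosen by how well candidate values classify the training samples nearest to it.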

PROPOSED ALGORITHM
EXPERIMENTAL RESULTS
CONCLUSION