Abstract

The nearest neighbor rule identifies the category of an unknown element according to the categories of its known nearest neighbors. This technique is effective in many fields, such as event recognition, text categorization and object recognition. Its prime advantage is its simplicity, but its main drawback is its computational complexity for large training sets. The research community has addressed this drawback as the problem of prototype selection. Several techniques, presented as condensing techniques, have been proposed to solve it. Condensing algorithms try to determine a significantly reduced set of prototypes while keeping the performance of the 1-NN rule on this set close to that reached on the complete training set. In this paper we present a survey of condensing KNN techniques, namely CNN, RNN, FCNN, DROP1-5, DEL, IKNN, TRKNN and CBP. All of these techniques improve efficiency in computation time, but none of them can prove the minimality of its resulting set. One possibility is therefore to hybridize them with other algorithms, called modern heuristics or metaheuristics, which can themselves further improve the solution. The metaheuristics with proven results in attribute selection are principally genetic algorithms and tabu search. This paper also sheds light on some recent techniques built on this template.
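To make the condensing idea concrete, here is a minimal sketch of a Hart-style condensed nearest neighbor (CNN) pass in Python. The function name `condense`, the Euclidean distance, and the seeding with the first point are illustrative assumptions, not taken from any of the surveyed papers:

```python
from math import dist

def condense(points, labels):
    """Hart-style condensing (CNN) sketch: grow a prototype set S until
    every training point is correctly classified by its 1-NN in S."""
    S = [0]                              # seed with an arbitrary prototype
    changed = True
    while changed:                       # extra passes until S is consistent
        changed = False
        for i in range(len(points)):
            # 1-NN lookup against the current condensed set S
            nearest = min(S, key=lambda j: dist(points[i], points[j]))
            if labels[nearest] != labels[i]:
                S.append(i)              # absorb the misclassified point
                changed = True
    return S
```

The loop stops when S is consistent (every training point agrees with its nearest prototype in S), which is the property condensing algorithms preserve while shrinking the set.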

Highlights

  • The K-nearest neighbor classification rule (KNN) proposed by Hart [4] is a powerful classification method that classifies an unknown prototype through a set of training prototypes

  • The KNN algorithm uses the relevance of the elements to eliminate redundant ones in large network datasets, and the subset obtained is the initial solution of the tabu search algorithm

  • In this paper we present and compare set reduction techniques based on the nearest neighbor principle
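The hybrid scheme in the highlights (a reduced subset seeding a tabu search over prototype subsets) can be sketched as follows. The move structure (flip one prototype in or out), the tabu tenure, and the small size penalty are illustrative assumptions, not the authors' exact design:

```python
from math import dist

def loo_accuracy(points, labels, subset):
    """Fraction of training points correctly classified by 1-NN
    restricted to `subset` (the point itself is excluded)."""
    correct = 0
    for i in range(len(points)):
        cand = [j for j in subset if j != i]
        if not cand:
            return 0.0
        nearest = min(cand, key=lambda j: dist(points[i], points[j]))
        correct += labels[nearest] == labels[i]
    return correct / len(points)

def tabu_prototype_selection(points, labels, initial, iters=30, tenure=5):
    """Tabu-search sketch: each iteration flips one index in or out of the
    current subset, recent flips are tabu, and the best subset seen is kept.
    `initial` could be the output of a condensing algorithm."""
    current = set(initial)
    best, best_acc = set(current), loo_accuracy(points, labels, current)
    tabu = {}                            # index -> iteration it stays tabu until
    for t in range(iters):
        moves = [i for i in range(len(points)) if tabu.get(i, -1) <= t]
        if not moves:
            continue
        # pick the non-tabu flip with the best (accuracy - size penalty) score
        def score(i):
            trial = current ^ {i}
            return loo_accuracy(points, labels, trial) - 0.01 * len(trial)
        i = max(moves, key=score)
        current ^= {i}
        tabu[i] = t + tenure             # forbid reversing this flip for a while
        acc = loo_accuracy(points, labels, current)
        if acc > best_acc or (acc == best_acc and len(current) < len(best)):
            best, best_acc = set(current), acc
    return sorted(best)
```

The tabu list lets the search escape local optima that a pure greedy reduction would get stuck in, which is why such metaheuristics can shrink the set further than condensing alone.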


Summary

INTRODUCTION

The K-nearest neighbor classification rule (KNN), proposed by Hart [4], is a powerful classification method that allows an almost infallible classification of an unknown prototype through a set of training prototypes. It is widely used in pattern recognition [20] [18], text categorization [10] [6], object recognition [8] and event recognition [23] applications. The databases used in some areas, such as intrusion detection, are constantly and dynamically updated; this constitutes one of the main drawbacks of the KNN rule. The scientific community has tackled these problems and proposed prototype selection, which modifies an initial set of prototypes by reducing its size in order to improve the classification performance.
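For concreteness, the basic KNN rule described above can be sketched as follows; the function name, Euclidean distance, and majority vote over the k nearest points are our own illustrative choices:

```python
from collections import Counter
from math import dist

def knn_classify(train_points, train_labels, query, k=3):
    """KNN rule sketch: label a query by majority vote among
    its k nearest training points (Euclidean distance)."""
    neighbors = sorted(range(len(train_points)),
                       key=lambda i: dist(query, train_points[i]))[:k]
    votes = Counter(train_labels[i] for i in neighbors)
    return votes.most_common(1)[0][0]
```

With k = 1 this reduces to the 1-NN rule that the condensing techniques surveyed below are evaluated against. Every query scans the whole training set, which is the computational cost that prototype selection aims to reduce.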

PROTOTYPE SELECTION
IMPROVING PROTOTYPE SELECTION
HEURISTIC METHODS BASED ON THE CONDENSING KNN RULE
The removal of an instance does not increase the encoding length cost
Comparison
METAHEURISTICS
Findings
CONCLUSION