Abstract

Instance-based learning often uses every instance in a training set to construct inference structures. A large number of instances and attributes can lead to high storage requirements and low search efficiency. Instance reduction addresses these issues by removing irrelevant and noisy instances from the training set. However, existing reduction techniques still suffer from parameter dependency and from relatively low accuracy and reduction rates. In this study, we present a natural neighborhood graph-based instance reduction algorithm, NNGIR. A natural neighborhood graph (NaNG) is constructed automatically by the natural neighbor search algorithm and provides a compact description of the nearest-neighbor relation over pairs of instances. NNGIR uses the NaNG to divide the original training set into noisy, border, and internal instances, and then obtains a reduced set by eliminating noisy and redundant points. NNGIR has three main advantages: (1) it is a parameter-free instance reduction algorithm, owing to its use of natural neighborhood graphs; (2) it substantially increases the reduction rate while maintaining or even improving prediction accuracy; and (3) its reduction rate fluctuates only slightly across different types of data sets. The efficiency of NNGIR is supported by positive results from experiments conducted on both synthetic and real data sets.
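
As a rough illustration of the pipeline summarized above, the following Python sketch builds a mutual-neighbor graph without a user-supplied k and then labels each instance as noisy, border, or internal. It is not the NNGIR implementation, and several details are assumptions because the abstract does not specify them: the stopping rule (grow the neighborhood size r until every point has a reverse neighbor, or until the count of points without one stops changing), the labeling rule (noisy if a point has no graph neighbors or if more than half of them carry a different class, border if at least one does, internal otherwise), and the function names natural_neighbor_graph and partition_instances, which are hypothetical. The step that removes redundant points is not sketched.

```python
import numpy as np


def natural_neighbor_graph(X):
    """Build a mutual-neighbor graph without a user-chosen k.

    Assumed stopping rule (not taken from the abstract): grow r until every
    point has at least one reverse neighbor, or until the number of points
    without one stops changing.
    """
    X = np.asarray(X, dtype=float)
    n = len(X)
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(dist, -1.0)             # force each point to rank first in its own row
    order = np.argsort(dist, axis=1)[:, 1:]  # neighbors by rank, self removed

    knn = [set() for _ in range(n)]
    reverse_count = np.zeros(n, dtype=int)
    prev_orphans = -1
    r = 0
    for r in range(1, n):
        # Add each point's r-th nearest neighbor and credit that neighbor.
        for i in range(n):
            j = int(order[i, r - 1])
            knn[i].add(j)
            reverse_count[j] += 1
        orphans = int(np.sum(reverse_count == 0))
        if orphans == 0 or orphans == prev_orphans:
            break
        prev_orphans = orphans

    # Keep only mutual edges: i and j must appear in each other's neighborhoods.
    graph = [{j for j in knn[i] if i in knn[j]} for i in range(n)]
    return graph, r


def partition_instances(graph, y):
    """Label each instance 'noisy', 'border', or 'internal' (assumed rule)."""
    labels = []
    for i, neighbors in enumerate(graph):
        differing = sum(1 for j in neighbors if y[j] != y[i])
        if not neighbors or 2 * differing > len(neighbors):
            labels.append("noisy")
        elif differing > 0:
            labels.append("border")
        else:
            labels.append("internal")
    return labels
```

Calling natural_neighbor_graph(X) on an array of shape (n_samples, n_features) returns the graph as a list of neighbor-index sets together with the final r; passing that graph and the class labels to partition_instances gives one category per instance. The dense distance matrix keeps the sketch short but costs O(n^2) memory, so a spatial index would be the natural substitute for larger training sets.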
