Abstract

The performance of the nearest neighbor (NN) algorithm is known to be highly sensitive to the distance measure used to find the NN of a query pattern. Another problem with the NN algorithm is that its performance degrades rapidly as the number of noisy (i.e., mislabeled) training samples increases. To tackle these problems, in this paper, a novel algorithm is proposed that uses the information in the training data to adapt the distance measure to the application at hand. For this purpose, similar to some other works reported in the literature, a weight is assigned to each training instance. The weight assigned to a training instance is used when calculating the distance between a query pattern and that instance. The major contribution of this paper lies in the way these parameters (i.e., the weights of the training instances) are specified: the proposed method determines the weights by minimizing an entropy measure. We use several benchmark datasets from the UCI repository to show that the scheme is quite successful in improving the performance of the basic NN algorithm. When dealing with noisy data, we show that the proposed scheme achieves significantly better performance than several other methods proposed in the literature.
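To make the idea of instance-weighted NN classification concrete, the following is a minimal sketch. It assumes a multiplicative weight on the Euclidean distance to each training instance, so that a large weight effectively pushes a (possibly mislabeled) instance away from all queries; the weight values shown are hypothetical placeholders, since the paper's actual entropy-minimization procedure for learning them is not reproduced here.

```python
import numpy as np

def weighted_nn_predict(X_train, y_train, weights, X_query):
    """Classify each query by the training instance with the smallest
    weighted distance w_i * ||q - x_i|| (assumed weight form)."""
    preds = []
    for q in X_query:
        dists = weights * np.linalg.norm(X_train - q, axis=1)
        preds.append(y_train[np.argmin(dists)])
    return np.array(preds)

# Toy usage: the last training point is "noisy" (mislabeled), and a
# hypothetical learned weight inflates its distance to every query.
X_train = np.array([[0.0, 0.0], [0.1, 0.2], [1.0, 1.0], [0.05, 0.1]])
y_train = np.array([0, 0, 1, 1])            # last label is incorrect
weights = np.array([1.0, 1.0, 1.0, 5.0])    # hypothetical learned weights
print(weighted_nn_predict(X_train, y_train, weights,
                          np.array([[0.0, 0.1]])))  # -> [0]
```

With uniform weights the query would be assigned the noisy instance's label; down-weighting (here, distance-inflating) that instance restores the correct prediction, which is the effect the proposed weight-learning scheme aims to achieve automatically.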
