Abstract
Nearest neighbor (NN) learning algorithms, examples of the lazy learning paradigm, rely on a distance function to measure the similarity of testing examples with the stored training examples. Since certain attributes are more discriminative, while others can be less or totally irrelevant, attributes should be weighed differently in the distance function. Most previous studies on weight setting for NN learning algorithms are empirical. In this paper we describe our attempt on deciding theoretically optimal weights that minimize the predictive error for NN algorithms. Assuming a uniform distribution of examples in a 2-d continuous space, we first derive the average predictive error introduced by a linear classification boundary, and then determine the optimal weight setting for any polygonal classification region. Our theoretical results of optimal attribute weights can serve as a baseline or lower bound for comparing other empirical weight setting methods.
Published Version
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have