Choice of neighbor order in nearest-neighbor classification

Peter Hall, Richard J Samworth, Byeong U Park

doi:10.1214/07-aos537

Abstract

The kth-nearest neighbor rule is arguably the simplest and most intuitively appealing nonparametric classification procedure. However, application of this method is inhibited by lack of knowledge about its properties, in particular, about the manner in which it is influenced by the value of k; and by the absence of techniques for empirical choice of k. In the present paper we detail the way in which the value of k determines the misclassification error. We consider two models, Poisson and Binomial, for the training samples. Under the first model, data are recorded in a Poisson stream and are “assigned” to one or other of the two populations in accordance with the prior probabilities. In particular, the total number of data in both training samples is a Poisson-distributed random variable. Under the Binomial model, however, the total number of data in the training samples is fixed, although again each data value is assigned in a random way. Although the values of risk and regret associated with the Poisson and Binomial models are different, they are asymptotically equivalent to first order, and also to the risks associated with kernel-based classifiers that are tailored to the case of two derivatives. These properties motivate new methods for choosing the value of k.
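The two sampling models described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' method or their proposed rule for choosing k. It only shows how a training sample might be drawn under each model and how the empirical misclassification rate of a k-nearest-neighbor classifier can be compared across values of k. The univariate Gaussian class densities, the sample size of 200, the prior of 0.5, and all function names are assumptions made for illustration.

```python
import random

def draw_training_sample(n, prior, rng, model="binomial"):
    """Return a list of (x, label) pairs under one of the two sampling models.

    Illustrative assumptions (not from the paper): univariate data,
    class 0 ~ N(0, 1) and class 1 ~ N(1.5, 1).
    """
    if model == "poisson":
        # Poisson model: the total sample size is Poisson(n). Count the
        # arrivals of a unit-rate Poisson stream in the interval [0, n].
        total, t = 0, rng.expovariate(1.0)
        while t <= n:
            total += 1
            t += rng.expovariate(1.0)
    else:
        # Binomial model: the total sample size is fixed at n.
        total = n
    sample = []
    for _ in range(total):
        # Under both models each point is assigned to a population at
        # random, according to the prior probability of class 1.
        label = 1 if rng.random() < prior else 0
        x = rng.gauss(1.5 if label else 0.0, 1.0)
        sample.append((x, label))
    return sample

def knn_classify(train, x, k):
    """Majority vote among the k training points nearest to x (ties go to class 0)."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    votes = sum(label for _, label in nearest)
    return 1 if 2 * votes > len(nearest) else 0

def error_rate(k, n, prior, model, rng, n_test=200):
    """Empirical misclassification rate on a fresh test sample of size n_test."""
    train = draw_training_sample(n, prior, rng, model)
    test = draw_training_sample(n_test, prior, rng, "binomial")
    wrong = sum(knn_classify(train, x, k) != y for x, y in test)
    return wrong / n_test

rng = random.Random(0)
for k in (1, 5, 15, 45):
    binomial_err = error_rate(k, 200, 0.5, "binomial", rng)
    poisson_err = error_rate(k, 200, 0.5, "poisson", rng)
    print(k, binomial_err, poisson_err)
```

A single run like this is noisy; averaging over many replications would be needed before reading anything into the difference between the two columns or the dependence on k.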

Journal: The Annals of Statistics
Publication Date: Oct 1, 2008
Citations: 273
License type: implied-oa
