Abstract
Todeschini, R., 1989. k-Nearest neighbour method: the influence of data transformations and metrics. Chemometrics and Intelligent Laboratory Systems, 6: 213–220. The k-nearest neighbour (KNN) method is widely used in pattern recognition because of its conceptual simplicity, general applicability and efficiency. It is most commonly applied with the Euclidean metric to measure distances between objects in the data set. However, the choice of metric may greatly affect the results of pattern recognition methods, and there seems to be little consensus on which metrics or similarity coefficients are most generally applicable, and in which cases. This study examines the influence of six different metrics and four data transformations on the KNN method applied to several data sets. The results show that the Lance-Williams, Manhattan and Canberra metrics give error rates comparable to, and in some cases better than, those obtained by linear discriminant analysis. As far as data transformations are concerned, scaling with respect to the maximum seems to give better results.
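The abstract does not give the formulas it used, so the following is only a minimal sketch of the general idea: a KNN classifier with an interchangeable metric, applied after scaling each variable by its maximum. The metric definitions are the commonly cited textbook forms (Lance-Williams is taken here in its Bray-Curtis form and assumes non-negative data), and all function names and the toy data are hypothetical, not taken from the paper.

```python
from collections import Counter


def manhattan(a, b):
    # Sum of absolute coordinate differences.
    return sum(abs(x - y) for x, y in zip(a, b))


def canberra(a, b):
    # Each difference is weighted by the magnitude of the two values.
    # Terms whose denominator is zero are skipped (assumed convention).
    total = 0.0
    for x, y in zip(a, b):
        denom = abs(x) + abs(y)
        if denom > 0:
            total += abs(x - y) / denom
    return total


def lance_williams(a, b):
    # Assumed Bray-Curtis form: Manhattan distance divided by the sum of
    # all coordinates. Only meaningful for non-negative data.
    denom = sum(x + y for x, y in zip(a, b))
    return manhattan(a, b) / denom if denom > 0 else 0.0


def column_maxima(rows):
    # Largest absolute value of each variable; 1.0 guards all-zero columns.
    return [max(abs(v) for v in col) or 1.0 for col in zip(*rows)]


def scale(row, maxima):
    # Scaling with respect to the maximum: divide each variable by its maximum.
    return [v / m for v, m in zip(row, maxima)]


def knn_predict(train_x, train_y, query, k, metric):
    # Rank training objects by distance to the query, then take a
    # majority vote among the k nearest.
    ranked = sorted(zip(train_x, train_y), key=lambda p: metric(p[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy data (hypothetical): two variables on very different scales.
raw_x = [[1.0, 200.0], [2.0, 220.0], [8.0, 900.0], [9.0, 950.0]]
labels = ["a", "a", "b", "b"]
maxima = column_maxima(raw_x)
train_x = [scale(row, maxima) for row in raw_x]
query = scale([1.5, 210.0], maxima)

predictions = {
    m.__name__: knn_predict(train_x, labels, query, k=3, metric=m)
    for m in (manhattan, canberra, lance_williams)
}
```

The maxima are computed on the training objects only and then reused for the query, so that the query is placed on the same scale as the data it is compared with. On real data the metrics would be compared by error rate, for example under cross-validation, rather than on a single query.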