Abstract

The k-NN algorithm is still very popular due to its simplicity and the easy interpretability of its results. However, the commonly used Euclidean distance is an arbitrary choice for many datasets, because the data are often described by measurements from different domains. As a consequence, the Euclidean distance frequently leads to poor k-NN classification rates. Feature weighting adapts the scaling of the individual dimensions and can significantly improve classification performance. We present a simple linear-programming-based method for feature weighting that, in contrast to other feature weighting methods, is robust to the initial scaling of the data dimensions. We evaluate the method on real-world datasets from the UCI repository, comparing it to other feature weighting algorithms and to Large Margin Nearest Neighbor classification (LMNN) as a metric learning algorithm.
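To make the idea of feature-weighted k-NN concrete, the sketch below shows how a per-feature weight vector rescales the Euclidean distance used for neighbor search. This is only a minimal illustration of the weighted distance the abstract refers to, not the paper's linear-programming method itself; the weight vector `w`, the function name, and the parameter choices are assumptions for the example.

```python
import numpy as np

def weighted_knn_predict(X_train, y_train, x_query, w, k=5):
    """Classify x_query with k-NN under a feature-weighted Euclidean distance.

    w is a per-feature weight vector, assumed to be supplied by some
    feature-weighting procedure; w = 1 for every feature recovers plain k-NN.
    """
    # Weighted squared Euclidean distance: sum_d w_d * (x_d - q_d)^2
    diffs = X_train - x_query
    dists = np.sqrt((w * diffs ** 2).sum(axis=1))

    # Majority vote among the k nearest training points
    nearest = np.argsort(dists)[:k]
    labels, counts = np.unique(y_train[nearest], return_counts=True)
    return labels[np.argmax(counts)]
```

With well-chosen weights, dimensions whose raw scale would otherwise dominate the distance contribute proportionally less, which is the effect the feature weighting in the paper is designed to achieve.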
