Data pair selection for accurate classification based on information-theoretic metric learning

Takashi Maga, Kenta Mikawa, Masayuki Goto

doi:10.1504/ajmsa.2017.10004254

Abstract

Data classification is one of the main techniques in data analysis and has become increasingly important in various fields of business. Automatic classification is the problem of learning, from training data, a rule that assigns category labels to data. One effective approach to automatic classification is the k-nearest neighbour (kNN) method, which relies on distances between data pairs, combined with distance metric learning. In this study, we focus on the information-theoretic metric learning (ITML) method. In ITML, the optimisation problem is formulated as learning a metric matrix so that the distance between each pair of data points belonging to the same class becomes smaller than a constant, while the distance between each pair belonging to different classes becomes larger than another constant. We propose an improved procedure that selects the data pairs that contribute effectively to clarifying the class boundaries. We verify the effectiveness of the proposed method through a simulation experiment on a benchmark dataset.
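
The pairwise constraints described above can be illustrated with a minimal sketch. It assumes the distance is the squared Mahalanobis form (x - y)^T A (x - y) for a metric matrix A, and it uses hypothetical thresholds u (upper bound for same-class pairs) and l (lower bound for different-class pairs). The function names and toy data are illustrative assumptions; the sketch only checks which pairs violate their constraint and implements neither the ITML optimisation nor the proposed pair selection procedure.

```python
import numpy as np

def mahalanobis_sq(A, x, y):
    """Squared distance (x - y)^T A (x - y) under metric matrix A."""
    diff = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    return float(diff @ A @ diff)

def violated_pairs(A, X, labels, u, l):
    """Return index pairs (i, j), i < j, whose distance breaks its constraint.

    Same-class pairs should satisfy d <= u; different-class pairs d >= l.
    u and l are hypothetical thresholds chosen for illustration.
    """
    violations = []
    n = len(X)
    for i in range(n):
        for j in range(i + 1, n):
            d = mahalanobis_sq(A, X[i], X[j])
            same = labels[i] == labels[j]
            if (same and d > u) or (not same and d < l):
                violations.append((i, j))
    return violations

# Toy data (assumed for illustration): two points per class in 2-D.
X = np.array([[0.0, 0.0], [1.0, 0.0], [3.0, 0.0], [3.0, 1.0]])
labels = [0, 0, 1, 1]
A = np.eye(2)  # identity matrix, i.e. squared Euclidean distance

result = violated_pairs(A, X, labels, u=1.5, l=5.0)
print(result)
```

With the identity matrix as the metric, only the different-class pair (1, 2) lies closer than l, so it is the one constraint a learned metric matrix would need to correct in this toy setting.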
