Abstract

Clustering and classification critically rely on distance metrics that provide meaningful comparisons between data points. To this end, learning optimal distance functions from data, known as metric learning, aims to facilitate supervised classification, particularly in high-dimensional spaces where visualization is challenging or infeasible. The Mahalanobis metric is the default choice due to its simplicity and its interpretability as a transformation of the Euclidean metric by a combination of rotation and scaling.

In this work, we present several novel contributions to metric learning, both in formulation and in solution methods. Our approach is motivated by agglomerative clustering, with certain novel modifications that enable a natural interpretation of the user-defined classes as clusters under the optimal metric. It generalizes and improves upon leading methods by removing reliance on pre-designated "target neighbors," "triplets," and "similarity pairs." Starting from the definition of a generalized metric that has the Mahalanobis metric as its second-order term, we propose an objective function for metric selection that, unlike most previous work, does not aim to isolate classes from one another, but instead distorts the space minimally by aggregating co-class members into local clusters. Further, we formulate the problem as a mixed-integer optimization that can be solved efficiently for small and medium-sized datasets and approximated for larger ones.

Another salient feature of our method is that it facilitates active learning by using the optimal metric to recommend precise regions to sample in order to improve classification performance. These regions are indicated by the boundary and outlier points of the dataset, as defined by the metric. This targeted acquisition can significantly reduce computation and data-acquisition costs by ensuring training data completeness, representativeness, and economy, which could also provide advantages in training data selection for other established methods such as Deep Learning and Random Forests. We demonstrate the classification and computational performance of our approach through several simple and intuitive examples, followed by results on real image and benchmark datasets.
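
To make the rotation-and-scaling interpretation concrete, the following minimal Python sketch (an illustration, not code from the paper) checks numerically that a Mahalanobis distance with metric matrix M = L^T L coincides with the Euclidean distance after applying the linear map L; the matrices and points here are arbitrary placeholders.

import numpy as np

# Illustrative sketch: a Mahalanobis metric
#   d_M(x, y) = sqrt((x - y)^T M (x - y)),  M = L^T L (positive semidefinite),
# equals the Euclidean metric after the linear map L, i.e., a
# combination of rotation and scaling of the original space.
rng = np.random.default_rng(0)

L = rng.standard_normal((3, 3))  # placeholder linear map (rotation + scaling)
M = L.T @ L                      # induced positive semidefinite metric matrix

x, y = rng.standard_normal(3), rng.standard_normal(3)

d_mahalanobis = np.sqrt((x - y) @ M @ (x - y))
d_euclidean_after_L = np.linalg.norm(L @ (x - y))

assert np.isclose(d_mahalanobis, d_euclidean_after_L)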
