Distance metric learning (DML) aims to learn a suitable measure for computing distances between instances. Facilitated by side information, the learned metric can often improve the performance of similarity- or distance-based methods such as kNN. Theoretical analyses of DML focus on the learning effectiveness of the squared Mahalanobis distance: specifically, whether the Mahalanobis metric learned from empirically sampled pairwise constraints is consistent with the optimal metric optimized over paired samples generated from the true distribution, and what the sample complexity of this process is. The excess risk measures the quality of generalization, i.e., the gap between the expected objective of the metric learned empirically from a regularized objective with a convex loss function and that of the optimal metric. Given N training examples, existing analyses of this non-i.i.d. learning problem have proved that the excess risk of DML converges to zero at a rate of $${\mathcal {O}}\left( \frac{1}{\sqrt{N}}\right) $$. In this paper, we obtain a faster convergence rate for DML, $${\mathcal {O}}\left( \frac{1}{N}\right) $$, when learning the distance metric with a smooth loss function and a strongly convex objective. Moreover, when the problem is relatively easy and the number of training samples is large enough, this rate can be further improved to $${\mathcal {O}}\left( \frac{1}{N^2}\right) $$. Synthetic experiments validate that DML can achieve the specified faster generalization rates, and results under various settings further illuminate the theoretical properties of DML.
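As a minimal sketch of the quantities involved (using assumed notation not taken from the paper itself: a positive semidefinite matrix $$M$$, an instance pair $$(x, x')$$ with similarity label $$y$$, a convex loss $$\ell $$, and a Frobenius-norm regularizer with weight $$\lambda $$), the setting can be written as
$$d_M^2(x, x') = (x - x')^\top M (x - x'), \quad M \succeq 0,$$
$$F(M) = {\mathbb {E}}_{(x, x', y)}\left[ \ell \left( y, d_M^2(x, x')\right) \right] + \lambda \Vert M\Vert _F^2,$$
$${\mathcal {E}}(\hat{M}) = F(\hat{M}) - \min _{M \succeq 0} F(M),$$
where $$\hat{M}$$ denotes the metric learned from the N empirically sampled pairwise constraints and $${\mathcal {E}}(\hat{M})$$ is the excess risk whose decay the rates $${\mathcal {O}}\left( \frac{1}{\sqrt{N}}\right) $$, $${\mathcal {O}}\left( \frac{1}{N}\right) $$, and $${\mathcal {O}}\left( \frac{1}{N^2}\right) $$ bound.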