Abstract

In deep metric learning (DML), understanding both the local and global characteristics of the embedding space is essential. However, conventional DML techniques have two limitations. First, Euclidean distance-based metrics do not capture global information such as class variability because they depend only on the physical distance between samples. Second, they assume that the embedding space is simply a vector space, which cannot represent complex data features. Therefore, we propose a novel loss function that fully utilizes the characteristics of the embedding space by using discriminant analysis and nonlinear mapping. Along with theoretical analysis, the superior performance of the proposed method is verified on fine-grained retrieval datasets such as Cars196, CUB200-2011, Stanford Online Products, and In-shop Clothes. Source code is available at https://github.com/kdhht2334/MCVA.

Highlights

  • Deep metric learning (DML) has been used to compare the similarity of samples in a supervised or unsupervised manner and has been applied to various fields such as product search [1, 2] and video highlight detection [3].

  • Linear discriminant analysis (LDA) is a tool to find d-dimensional eigenvectors $\{e_i\}_{i=1}^{m}$ that maximize the ratio of inter- to intra-class variability: $E_{\mathrm{opt}} =$

  • P2ML shows that a prototypical network [64], which enables global class-variability analysis based on LDS embedding, is useful for exploiting the information of the embedding space effectively.
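
The LDA highlight above describes directions that maximize the ratio of inter- to intra-class variability. As a rough illustration only, the sketch below computes Fisher's discriminant direction for the simplest case (two classes in 2-D), where the leading direction is proportional to S_W^{-1}(mu_a - mu_b). It is not the P2ML loss or the authors' code; the toy data and the helper names (`fisher_direction`, `fisher_ratio`) are assumptions made for this example.

```python
# Minimal two-class Fisher LDA in 2-D, pure Python.
# Illustrative sketch only: toy data and helper names are hypothetical,
# and this is not the loss function proposed in the paper.

def mean(points):
    n = len(points)
    return [sum(p[0] for p in points) / n, sum(p[1] for p in points) / n]

def scatter(points, mu):
    # 2x2 scatter matrix: sum over samples of (x - mu)(x - mu)^T
    s = [[0.0, 0.0], [0.0, 0.0]]
    for p in points:
        d = [p[0] - mu[0], p[1] - mu[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s

def fisher_direction(class_a, class_b):
    mu_a, mu_b = mean(class_a), mean(class_b)
    s_a, s_b = scatter(class_a, mu_a), scatter(class_b, mu_b)
    # Within-class scatter S_W = S_a + S_b
    s_w = [[s_a[i][j] + s_b[i][j] for j in range(2)] for i in range(2)]
    det = s_w[0][0] * s_w[1][1] - s_w[0][1] * s_w[1][0]
    diff = [mu_a[0] - mu_b[0], mu_a[1] - mu_b[1]]
    # w = S_W^{-1} (mu_a - mu_b), using the closed-form 2x2 inverse
    w = [(s_w[1][1] * diff[0] - s_w[0][1] * diff[1]) / det,
         (-s_w[1][0] * diff[0] + s_w[0][0] * diff[1]) / det]
    norm = (w[0] ** 2 + w[1] ** 2) ** 0.5
    return [w[0] / norm, w[1] / norm]

def fisher_ratio(w, class_a, class_b):
    # Inter-class separation divided by intra-class spread after projecting on w
    pa = [w[0] * p[0] + w[1] * p[1] for p in class_a]
    pb = [w[0] * p[0] + w[1] * p[1] for p in class_b]
    ma, mb = sum(pa) / len(pa), sum(pb) / len(pb)
    within = sum((x - ma) ** 2 for x in pa) + sum((x - mb) ** 2 for x in pb)
    return (ma - mb) ** 2 / within

# Two classes that are spread widely along x but separated along y.
class_a = [(0.0, 0.0), (2.0, 0.2), (4.0, 0.0), (6.0, 0.2)]
class_b = [(0.0, 1.0), (2.0, 1.2), (4.0, 1.0), (6.0, 1.2)]

w = fisher_direction(class_a, class_b)
print("discriminant direction:", w)
print("ratio along w:         ", fisher_ratio(w, class_a, class_b))
print("ratio along mean diff: ", fisher_ratio([0.0, 1.0], class_a, class_b))
```

On this toy data the discriminant direction separates the classes better than the plain mean-difference direction, because it also accounts for how each class is spread. That intra-class spread is the kind of global information that a raw Euclidean distance between two samples does not carry.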

Summary

INTRODUCTION

Deep metric learning (DML) has been used to compare the similarity of samples in a supervised or unsupervised manner and has been applied to various fields such as product search [1, 2] and video highlight detection [3]. Most DML methods define similarity metrics using only the Euclidean distance, that is, the physical distance between embedding vectors derived from convolutional neural networks (CNNs). [23] tried to generalize to unseen data through manifold similarity derived from random walks and meta class-based proxies. However, it was somewhat dependent on the ensemble method, and it could not guarantee state-of-the-art performance because only N-pair [7] was adopted as the base technique. On the other hand, [30] assumed that embedding vectors lie on linear subspaces of a vector space, i.e., a Grassmann manifold.
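
To make the contrast between Euclidean distance and a subspace-based view more concrete, the sketch below compares the two on toy vectors. It treats each vector as spanning a 1-D subspace (a line through the origin) and measures the angle between lines, which is the geodesic distance on the simplest Grassmann manifold. This is an assumption-laden illustration, not the formulation used in [30] or in P2ML; the function names are hypothetical.

```python
import math

# Illustrative sketch only: Euclidean distance between vectors versus the
# angle between the 1-D subspaces (lines) they span. Function names are
# hypothetical and not taken from the paper or its cited works.

def euclidean(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def line_angle(u, v):
    # Angle between the lines spanned by u and v; sign and scale are ignored.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    cosine = min(1.0, abs(dot) / (norm_u * norm_v))
    return math.acos(cosine)

u = (1.0, 0.0)
v = (-3.0, 0.0)   # same line as u, but flipped and rescaled
t = (0.0, 2.0)    # orthogonal line

print("Euclidean u-v:", euclidean(u, v), " line angle u-v:", line_angle(u, v))
print("Euclidean u-t:", euclidean(u, t), " line angle u-t:", line_angle(u, t))
```

Here `u` and `v` are far apart in Euclidean terms yet span the same line, so their subspace distance is zero, while `u` and `t` are closer in Euclidean terms but span orthogonal lines.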

Notation
Problem Statement
Riemannian Geometry
METHOD
Various P2ML Forms
In-depth Analysis of P2ML
Overall flow of P2ML
EXPERIMENTS
Method
Performance Evaluation
Ablation Study
Findings
CONCLUSION AND FUTURE WORK