Abstract
In deep metric learning (DML), understanding both the local and global characteristics of the embedding space is essential. However, conventional DML techniques have two limitations. First, Euclidean distance-based metrics do not capture global information such as class variability, because they depend only on the physical distance between samples. Second, they assume that the embedding space is a simple vector space, which cannot represent complex data features. We therefore propose a novel loss function that fully utilizes the characteristics of the embedding space through discriminant analysis and nonlinear mapping. With theoretical analysis, the superior performance of the proposed method is verified on fine-grained retrieval datasets such as Cars196, CUB200-2011, Stanford Online Products, and In-shop Clothes. Source code is available at <uri>https://github.com/kdhht2334/MCVA</uri>.
Highlights
Deep metric learning (DML) has been used to compare the similarity of samples in a supervised or unsupervised manner, and has been applied to various fields such as product search [1, 2] and video highlight detection [3]
Linear discriminant analysis (LDA) is a tool to find d-dimensional eigenvectors {e_i}_{i=1}^{m} that maximize the ratio of inter- to intra-class variability: E_opt = argmax_E |E^T S_B E| / |E^T S_W E|, where S_B and S_W denote the between-class and within-class scatter matrices
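The LDA objective can be solved numerically as a generalized eigenproblem on the scatter matrices. A minimal NumPy sketch under standard assumptions (the function name and toy data are illustrative, not from the paper's code):

```python
import numpy as np

def lda_directions(X, y, m):
    """Return the m LDA directions maximizing between/within class scatter."""
    mu = X.mean(axis=0)
    d = X.shape[1]
    Sw = np.zeros((d, d))  # within-class scatter S_W
    Sb = np.zeros((d, d))  # between-class scatter S_B
    for c in np.unique(y):
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        Sw += (Xc - mc).T @ (Xc - mc)
        diff = (mc - mu).reshape(-1, 1)
        Sb += len(Xc) * (diff @ diff.T)
    # Generalized eigenproblem S_B e = lambda * S_W e, via pinv(S_W) @ S_B
    evals, evecs = np.linalg.eig(np.linalg.pinv(Sw) @ Sb)
    order = np.argsort(evals.real)[::-1]  # largest eigenvalues first
    return evecs.real[:, order[:m]]

# Toy data: two well-separated 3-D classes
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (50, 3)), rng.normal(5, 1, (50, 3))])
y = np.array([0] * 50 + [1] * 50)
E = lda_directions(X, y, 1)
print(E.shape)  # (3, 1)
```

Projecting the toy data onto `E` separates the two class means along a single axis, which is exactly the "inter- over intra-class variability" criterion stated above.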
P2ML shows that a prototypical network [64] based on LDA embedding, which can exploit global class variability, is useful for utilizing the information of the embedding space effectively
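For context, the prototype computation at the heart of a prototypical network reduces to class-mean embeddings plus nearest-prototype assignment. A minimal NumPy sketch (function names and data are illustrative, not the paper's implementation):

```python
import numpy as np

def prototypes(Z, y):
    """Class prototypes = mean embedding per class (prototypical-network style)."""
    classes = np.unique(y)
    return classes, np.stack([Z[y == c].mean(axis=0) for c in classes])

def assign(q, protos):
    """Index of the nearest prototype to query embedding q (Euclidean distance)."""
    return int(np.argmin(np.linalg.norm(protos - q, axis=1)))

# Toy embeddings from two classes
Z = np.array([[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [5.0, 6.0]])
y = np.array([0, 0, 1, 1])
classes, P = prototypes(Z, y)
print(assign(np.array([4.5, 5.2]), P))  # 1
```

Because each prototype is a class mean, comparing a query against prototypes rather than individual samples is one way such methods inject class-level (global) structure into an otherwise pairwise metric.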
Summary
Deep metric learning (DML) has been used to compare the similarity of samples in a supervised or unsupervised manner, and has been applied to various fields such as product search [1, 2] and video highlight detection [3]. Most DML methods define similarity metrics considering only Euclidean distance, that is, the physical distance between embedding vectors derived from convolutional neural networks (CNNs). [23] tried to generalize to unseen data through manifold similarity derived from random walks and meta class-based proxies. However, it was somewhat dependent on the ensemble method, and it could not guarantee state-of-the-art performance because only N-pair [7] was adopted as the base technique. On the other hand, [30] assumed that embedding vectors lie on linear subspaces of a vector space, i.e., the Grassmann manifold
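The Euclidean-only similarity that the summary criticizes amounts to a pairwise distance matrix over the CNN embeddings, with no class-level statistics involved. A minimal NumPy sketch (illustrative, not the paper's code):

```python
import numpy as np

def pairwise_euclidean(Z):
    """Pairwise Euclidean distance matrix for embeddings Z of shape (n, d)."""
    sq = (Z ** 2).sum(axis=1)
    # ||a - b||^2 = ||a||^2 + ||b||^2 - 2 a.b, clipped to avoid tiny negatives
    d2 = sq[:, None] + sq[None, :] - 2 * Z @ Z.T
    return np.sqrt(np.maximum(d2, 0.0))

Z = np.array([[0.0, 0.0], [3.0, 4.0], [0.0, 1.0]])
D = pairwise_euclidean(Z)
print(D[0, 1])  # 5.0
```

Every entry of `D` depends only on two samples at a time, which illustrates the first limitation from the abstract: no term in this metric reflects class variability or any other global property of the embedding space.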