Do Cross Modal Systems Leverage Semantic Relationships?

Shah Nawaz, M Kamran Janjua, Ignazio Gallo, Alessandro Calefati, Faisal Shafait, Arif Mahmood

doi:10.1109/iccvw.2019.00551

Abstract

Current cross modal retrieval systems are evaluated using the R@K measure, which does not leverage semantic relationships but instead strictly follows the manually marked image-text query pairs. Therefore, current systems do not generalize well to unseen data in the wild. To handle this, we propose a new measure, SemanticMap, to evaluate the performance of cross modal systems. Our proposed measure evaluates the semantic similarity between the image and text representations in the latent embedding space. We also propose a novel cross modal retrieval system using a single stream network for bidirectional retrieval. The proposed system is based on a deep neural network trained using extended center loss, minimizing the distance of image and text descriptions in the latent space from the class centers. In our system, the text descriptions are also encoded as images, which enables us to use a single stream network for both text and images. To the best of our knowledge, our work is the first of its kind in terms of employing a single stream network for cross modal retrieval systems. The proposed system is evaluated on two publicly available datasets, MSCOCO and Flickr30K, and has shown comparable results to the current state-of-the-art methods.
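
To make the two technical ingredients named in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes that R@K is computed over a query-by-candidate similarity matrix in which candidate i is the single annotated match for query i, and it shows the standard center loss (averaged over the batch), not the extended variant, whose exact form the abstract does not give. The function names `recall_at_k` and `center_loss` are hypothetical.

```python
from typing import Sequence


def recall_at_k(similarity: Sequence[Sequence[float]], k: int) -> float:
    """Fraction of queries whose annotated match ranks in the top k.

    similarity[i][j] is the score between query i and candidate j.
    Assumption: candidate i is the only annotated match for query i,
    so a semantically similar but unannotated candidate counts as a miss.
    """
    hits = 0
    for i, row in enumerate(similarity):
        # Rank candidate indices by descending similarity score.
        ranked = sorted(range(len(row)), key=lambda j: row[j], reverse=True)
        if i in ranked[:k]:
            hits += 1
    return hits / len(similarity)


def center_loss(
    embeddings: Sequence[Sequence[float]],
    labels: Sequence[int],
    centers: Sequence[Sequence[float]],
) -> float:
    """Standard center loss: half the mean squared distance between each
    embedding and the center of its class (not the paper's extended form)."""
    total = 0.0
    for x, y in zip(embeddings, labels):
        c = centers[y]
        total += sum((xi - ci) ** 2 for xi, ci in zip(x, c))
    return 0.5 * total / len(embeddings)


if __name__ == "__main__":
    # Toy similarity matrix: query 1 ranks its annotated match last.
    sim = [
        [0.9, 0.1, 0.3],
        [0.8, 0.2, 0.5],
        [0.1, 0.2, 0.7],
    ]
    print(recall_at_k(sim, 1))
    print(recall_at_k(sim, 3))

    # Two embeddings, one per class, each at distance 1 from its center.
    print(center_loss([[1.0, 0.0], [0.0, 2.0]], [0, 1], [[0.0, 0.0], [0.0, 1.0]]))
```

In the toy matrix, query 1 is scored as a miss at K=1 and K=2 even if candidates 0 and 2 were semantically close to it, which illustrates the strict pair matching that the abstract criticizes.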

Similar Papers

Learning an enhanced consensus representation for multi-view clustering via latent representation correlation preserving
Zhongyan Gui ... Zhiqiang Xie
Knowledge-Based Systems | VOL. 253
22 Jul 2022

Zero-Shot Visual Recognition via Bidirectional Latent Embedding
Qian Wang ... Ke Chen
International Journal of Computer Vision | VOL. 124
28 Jun 2017

MoveAE
Michael Suguitan ... Guy Hoffman
09 Mar 2020

On Field Implementation of Real-Time Bit-Wear Estimation with Bit Agnostic Deep Learning Artificial Intelligence Model Along with Physics-Hybrid Features
Huang Xu ... Guodong David Zhan
23 May 2023
