Abstract

Cross-modal image-text retrieval has been a long-standing challenge in the multimedia community. Existing methods explore various complicated embedding spaces to assess the semantic similarity between a given image-text pair, but pay little or no attention to the consistency across these spaces. To remedy this, we introduce the idea of semantic consistency for learning multiple embedding spaces jointly. Specifically, similar to previous works, we start by constructing two different embedding spaces, namely the image-grounded embedding space and the text-grounded embedding space. However, instead of learning these two embedding spaces separately, we incorporate a semantic consistency constraint into the common ranking objective function so that both embedding spaces are learned simultaneously and benefit from each other to improve performance. We conduct extensive experiments on three benchmark datasets, \ie Flickr8k, Flickr30k and MS COCO. Results show that our model outperforms state-of-the-art models on all three datasets, demonstrating the effectiveness and superiority of introducing semantic consistency. Our source code is released at \url{https://github.com/HuiChen24/SemanticConsistency}.
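As a rough illustration only (the abstract does not give the exact formulation, which is detailed in the paper body), a joint objective of this kind could combine a bidirectional triplet ranking loss in each embedding space with a consistency term penalizing disagreement between the two spaces' similarity scores; the notation below ($s_I$, $s_T$, $\alpha$, $\lambda$) is our own assumption:

% Hypothetical sketch, not the paper's exact loss.
% s_I(i,t): similarity in the image-grounded space; s_T(i,t): similarity in the text-grounded space;
% (i,t): a matched image-text pair; \hat{t}, \hat{i}: negative text/image; \alpha: margin; \lambda: trade-off weight.
\begin{equation*}
\mathcal{L} =
\underbrace{\sum_{(i,t)} \big[\alpha - s_I(i,t) + s_I(i,\hat{t})\big]_+ + \big[\alpha - s_I(i,t) + s_I(\hat{i},t)\big]_+}_{\text{ranking in the image-grounded space}}
+ \underbrace{\sum_{(i,t)} \big[\alpha - s_T(i,t) + s_T(i,\hat{t})\big]_+ + \big[\alpha - s_T(i,t) + s_T(\hat{i},t)\big]_+}_{\text{ranking in the text-grounded space}}
+ \lambda \underbrace{\sum_{(i,t)} \big(s_I(i,t) - s_T(i,t)\big)^2}_{\text{semantic consistency}}
\end{equation*}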
