Similarity Reasoning and Filtration for Image-Text Matching

Haiwen Diao, Huchuan Lu, Ying Zhang, Lin Ma

doi:10.1609/aaai.v35i2.16209

Abstract

Image-text matching plays a critical role in bridging vision and language, and great progress has been made by exploiting the global alignment between image and sentence, or the local alignments between regions and words. However, how to make the most of these alignments to infer more accurate matching scores remains underexplored. In this paper, we propose a novel Similarity Graph Reasoning and Attention Filtration (SGRAF) network for image-text matching. Specifically, vector-based similarity representations are first learned to characterize the local and global alignments more comprehensively. The Similarity Graph Reasoning (SGR) module, which relies on a graph convolutional neural network, is then introduced to infer relation-aware similarities from both the local and global alignments. The Similarity Attention Filtration (SAF) module is further developed to integrate these alignments effectively by selectively attending to the significant and representative alignments while suppressing the interference of non-meaningful ones. We demonstrate the superiority of the proposed method by achieving state-of-the-art performance on the Flickr30K and MSCOCO datasets, and we show the good interpretability of SGR and SAF through extensive qualitative experiments and analyses.
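The abstract names three ingredients: vector-based similarity representations, graph reasoning over them, and attention-based filtration. The following is a minimal NumPy sketch of how such a pipeline could be wired together, not the authors' implementation. The abstract gives no formulas, so the specific choices here are assumptions made for illustration: the squared-difference projection, the softmax edge weights, the sigmoid gate, all function names, and the random matrices standing in for learned parameters.

```python
import numpy as np

def similarity_vectors(x, y, W):
    """Map feature pairs to unit-length similarity vectors (assumed form).

    x, y: (n, d) paired features; W: (m, d) projection. Returns (n, m).
    """
    s = ((x - y) ** 2) @ W.T
    return s / (np.linalg.norm(s, axis=-1, keepdims=True) + 1e-8)

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def graph_reasoning_step(nodes, W_in, W_out, W_g):
    """One message-passing round over a fully connected similarity graph.

    nodes: (k, m) similarity vectors. Edge weights are a row-wise softmax
    over pairwise affinities (an assumption of this sketch).
    """
    edges = softmax((nodes @ W_in.T) @ (nodes @ W_out.T).T, axis=-1)  # (k, k)
    return np.maximum(edges @ nodes @ W_g.T, 0.0)  # ReLU, shape (k, m)

def attention_filtration(nodes, w_att):
    """Aggregate alignments with normalized sigmoid gates (assumed form)."""
    gate = 1.0 / (1.0 + np.exp(-(nodes @ w_att)))  # (k,), each in (0, 1)
    weights = gate / gate.sum()                    # weights sum to 1
    return weights @ nodes, weights                # (m,), (k,)

# Toy inputs: random stand-ins for learned features and parameters.
rng = np.random.default_rng(0)
d, m, k = 8, 4, 5  # feature dim, similarity dim, number of local alignments
W = rng.normal(size=(m, d))
local = similarity_vectors(rng.normal(size=(k, d)), rng.normal(size=(k, d)), W)
glob = similarity_vectors(rng.normal(size=(1, d)), rng.normal(size=(1, d)), W)
nodes = np.concatenate([glob, local])  # (k + 1, m): global node first

reasoned = graph_reasoning_step(
    nodes, rng.normal(size=(m, m)), rng.normal(size=(m, m)), rng.normal(size=(m, m))
)
fused, weights = attention_filtration(nodes, rng.normal(size=m))
print(reasoned.shape, fused.shape)
```

In this sketch the graph step lets every alignment exchange information with every other one, while the filtration step down-weights alignments whose gate value is small. In a trained model, a final scoring layer would map the reasoned or fused vector to a scalar matching score.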


Journal: Proceedings of the ... AAAI Conference on Artificial Intelligence. AAAI Conference on Artificial Intelligence
Publication Date: May 18, 2021
Citations: 139

