A Multiview Text Imagination Network Based on Latent Alignment for Image-Text Matching

Heng Shang, Guoshuai Zhao, Xueming Qian, Jing Shi

doi:10.1109/mis.2023.3265176

Abstract

In the field of image-text matching, one key to improving performance is extracting features that carry more semantic information. Existing work demonstrates that semantic enrichment through knowledge expansion can improve performance. Most of these approaches expand image features. However, the shortage of semantic information in the text modality and the one-sided nature of a single view are often bottlenecks that limit the performance of image-text matching models. To address these two problems, we aggregate knowledge from multiple views and propose a Word Imagination Graph (WIG). WIG expands textual semantic information by imagination based on input images. Using WIG, we then construct a novel Multi-View Text Imagination Network (MTIN). MTIN enables latent alignment of images and texts on tags, which assists matching at the semantic level. Results on the Flickr30K and MS-COCO datasets demonstrate the effectiveness of our method.

Journal: IEEE Intelligent Systems
Publication Date: May 1, 2023
Citations: 3

Similar Papers

Parallel-fusion LSTM with synchronous semantic and visual information for image captioning
Jing Zhang ... Zhe Wang
Journal of Visual Communication and Image Representation | VOL. 75
01 Feb 2021

Image captioning using relevance attention and ITEM encoding
Hongliang Zhang ... Guangming Li
14 Dec 2021

End-to-end training image-text matching network
Depeng Wang ... Yibo Sun
01 Jul 2022

A Model of Deceitful Information Communication: Some Views on Theory and Practice of Semantic Information
Nan Wang ... Bocong Li
09 Jun 2017
