Adaptive Latent Graph Representation Learning for Image-Text Matching.

Mengxiao Tian, Yunde Jia, Xinxiao Wu

doi:10.1109/tip.2022.3229631

Abstract

Image-text matching is a challenging task due to the modality gap. Many recent methods focus on modeling entity relationships to learn a common embedding space of image and text. However, these methods suffer from distractions in entity relationships, such as irrelevant visual regions in an image and noisy textual words in a text. In this paper, we propose an adaptive latent graph representation learning method to reduce the distractions of entity relationships for image-text matching. Specifically, we use an improved graph variational autoencoder to separate the distracting factors from the latent factor of relationships and jointly learn latent textual graph representations, latent visual graph representations, and a visual-textual graph embedding space. We also introduce an adaptive cross-attention mechanism to perform feature attending on the latent graph representations across images and texts, thus further narrowing the modality gap to boost the matching performance. Extensive experiments on two public datasets, Flickr30K and COCO, show the effectiveness of our method.

Journal: IEEE Transactions on Image Processing
Publication Date: Jan 1, 2023
Citations: 7

Similar Papers

Sparse Relational Topic Models for Document Networks
Aonan Zhang ... Bo Zhang
01 Jan 2013

Multiview Clustering via Proximity Learning in Latent Representation Space.
Bao-Yu Liu ... Philip S Yu
IEEE Transactions on Neural Networks and Learning Systems | VOL. 34
01 Feb 2023

Feature selection via Non-convex constraint and latent representation learning with Laplacian embedding
Ronghua Shang ... Licheng Jiao
Expert Systems with Applications | VOL. 208
22 Jul 2022

Gene selection for microarray data classification via dual latent representation learning
Xiao Zheng ... Chujie Zhang
Neurocomputing | VOL. 461
22 Jul 2021
