Abstract

The heterogeneity gap leads to inconsistent distributions and representations of image and text, which raises the challenging task of measuring their similarity and constructing cross-media correlation between them. Existing works mainly model cross-media correlation in a common subspace, but such a third-party subspace, built through intermediate unidirectional transformation, yields insufficient correlation modeling. Inspired by recent advances in neural machine translation, which establishes correspondence between two entirely different languages, we observe a striking commonality with cross-media correlation learning: image and text can be treated as bilingual pairs, where the image acts as a special kind of language providing visual description. Bidirectional transformation can then be conducted between image and text to effectively explore cross-media correlation in the feature space of each media type. We therefore propose a reinforced cross-media bidirectional translation (RCBT) approach to model the correlation between visual and textual descriptions. First, a cross-media bidirectional translation mechanism conducts direct transformation between the bilingual pairs of visual and textual descriptions in both directions, so that cross-media correlation is effectively captured in the feature spaces of both image and text through bidirectional translation training. Second, a cross-media context-aware network with residual attention exploits rich spatial and temporal context hints via a cross-media convolutional recurrent neural network, leading to more precise correlation learning that promotes the bidirectional translation process. Third, cross-media reinforcement learning casts each round between image and text as a two-agent communication game to boost the bidirectional translation process; we further extract inter-media and intra-media reward signals to provide complementary clues for learning cross-media correlation. Extensive experiments on cross-media retrieval verify the effectiveness of the proposed RCBT approach against 11 state-of-the-art methods on three cross-media datasets.
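To make the core idea concrete, the following is a minimal conceptual sketch of bidirectional translation between modality feature spaces. It is not the authors' RCBT network: the translators here are plain linear maps fit by least squares (RCBT uses convolutional recurrent networks with residual attention and reinforcement learning), and the feature dimensions and random "encodings" are illustrative assumptions only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins for encoded features of N paired samples:
# image features of dimension d_img, text features of dimension d_txt.
N, d_img, d_txt = 8, 6, 4
img = rng.normal(size=(N, d_img))
txt = rng.normal(size=(N, d_txt))

# Bidirectional "translators": image -> text space and text -> image space.
# Linear least-squares maps here; hypothetical placeholders for learned networks.
W_i2t, *_ = np.linalg.lstsq(img, txt, rcond=None)  # shape (d_img, d_txt)
W_t2i, *_ = np.linalg.lstsq(txt, img, rcond=None)  # shape (d_txt, d_img)

def translation_losses(img, txt):
    """Translation errors measured in each modality's own feature space."""
    loss_i2t = np.mean((img @ W_i2t - txt) ** 2)          # image -> text
    loss_t2i = np.mean((txt @ W_t2i - img) ** 2)          # text -> image
    # Round-trip (back-translation) error: image -> text space -> image space,
    # the kind of signal bidirectional translation training can exploit.
    loss_cycle = np.mean((img @ W_i2t @ W_t2i - img) ** 2)
    return loss_i2t, loss_t2i, loss_cycle

l_i2t, l_t2i, l_cycle = translation_losses(img, txt)
```

In this toy setting, each loss is a mean squared error computed directly in the image or text feature space, rather than in a third-party common subspace; training both directions jointly is what lets correlation be modeled in both native spaces.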

