Retrieval-enhanced adversarial training with dynamic memory-augmented attention for image paragraph captioning

Chunpu Xu, Min Yang, Xiang Ao, Ying Shen, Ruifeng Xu, Jinwen Tian

doi:10.1016/j.knosys.2020.106730

Abstract

Existing image paragraph captioning methods generate long paragraph captions solely from input images, relying on insufficient information. In this paper, we propose retrieval-enhanced adversarial training with dynamic memory-augmented attention for image paragraph captioning (RAMP), which makes full use of the R-best retrieved candidate captions to enhance image paragraph captioning via adversarial training. Concretely, RAMP treats the retrieved captions as reference captions to augment the discriminator during adversarial training, encouraging the image captioning model (generator) to incorporate informative content from the retrieved captions into the generated caption. In addition, a retrieval-enhanced dynamic memory-augmented attention network is devised to keep track of the coverage information and attention history along the update chain of the decoder state, thereby avoiding repetitive or incomplete image descriptions. Finally, a copying mechanism is applied to select words from the retrieved candidate captions and place them at the proper positions in the target caption, improving the fluency and informativeness of the generated caption. Extensive experiments on a benchmark dataset (i.e., Stanford) demonstrate that the proposed RAMP model significantly outperforms state-of-the-art methods across multiple evaluation metrics. For reproducibility, we provide the code and data at https://github.com/anonymous-caption/RAMP.
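
The abstract does not specify how the copying mechanism is formulated. As a rough illustration only, the sketch below assumes a pointer-generator-style mixture, in which a gate p_gen blends the decoder's vocabulary distribution with attention weights over the tokens of the retrieved captions. The function name copy_mixture, its parameters, and the toy numbers are all hypothetical and are not taken from the RAMP code.

```python
from collections import defaultdict


def copy_mixture(p_vocab, attention, retrieved_tokens, p_gen):
    """Blend a generation distribution with a copy distribution.

    Assumed formulation (not confirmed by the paper):
        P(w) = p_gen * P_vocab(w) + (1 - p_gen) * sum of attention on copies of w

    p_vocab:          dict mapping word -> probability from the decoder softmax
    attention:        list of weights over retrieved_tokens, summing to 1
    retrieved_tokens: list of words from the retrieved candidate captions
    p_gen:            probability in [0, 1] of generating rather than copying
    """
    if len(attention) != len(retrieved_tokens):
        raise ValueError("attention and retrieved_tokens must have equal length")
    if not 0.0 <= p_gen <= 1.0:
        raise ValueError("p_gen must lie in [0, 1]")

    final = defaultdict(float)
    # Generation path: scale every vocabulary probability by p_gen.
    for word, prob in p_vocab.items():
        final[word] += p_gen * prob
    # Copy path: add attention mass to each retrieved token, so words that
    # appear several times (or are missing from the vocabulary) still get mass.
    for word, weight in zip(retrieved_tokens, attention):
        final[word] += (1.0 - p_gen) * weight
    return dict(final)


# Toy example with made-up numbers.
p_vocab = {"a": 0.5, "dog": 0.3, "runs": 0.2}
retrieved = ["dog", "frisbee", "dog"]
attention = [0.25, 0.5, 0.25]
mixed = copy_mixture(p_vocab, attention, retrieved, p_gen=0.5)
```

In this toy case the word "frisbee" is absent from the vocabulary distribution yet still receives probability through the copy path, which is the kind of effect the abstract attributes to copying words from retrieved captions.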

Journal: Knowledge-Based Systems
Publication Date: Dec 30, 2020
Citations: 14