Abstract

Recently, it has been shown that generative adversarial networks (GANs) can be used directly as an extension of traditional reinforcement learning in image captioning. However, existing GAN-based methods generate captions as a function of only local points in the feature map, without capturing non-local information. In this paper, a Multi-Attention mechanism is first proposed that exploits both local and non-local evidence for more effective feature representation and reasoning in image captioning. Based on this mechanism, a Multi-Attention Generative Adversarial Image Captioning Network (MAGAN) is proposed, consisting of a Multi-Attention generator and a Multi-Attention discriminator. The generator is designed to produce more accurate sentences, while the discriminator determines whether a sentence is human-described or machine-generated. Extensive experiments on the MS COCO benchmark dataset validate the proposed framework, which achieves very competitive results on the evaluation server of the MS COCO captioning challenge.
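The abstract does not give the exact formulation, but a minimal sketch of the general idea, fusing per-position (local) attention with a non-local self-attention block over a CNN feature map, might look as follows. All layer names, shapes, and the fusion scheme here are illustrative assumptions, not the authors' implementation.

```python
# Sketch only: combining local and non-local attention over a CNN feature
# map, in the spirit of the Multi-Attention mechanism described above.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiAttention(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        # Local branch: an independent weight per spatial position.
        self.local_score = nn.Conv2d(channels, 1, kernel_size=1)
        # Non-local branch (self-attention in the style of non-local
        # neural networks): pairwise responses between all positions.
        inter = channels // 2
        self.theta = nn.Conv2d(channels, inter, kernel_size=1)
        self.phi = nn.Conv2d(channels, inter, kernel_size=1)
        self.g = nn.Conv2d(channels, inter, kernel_size=1)
        self.out = nn.Conv2d(inter, channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        # Non-local branch: attend over every pair of spatial positions.
        q = self.theta(x).flatten(2).transpose(1, 2)   # (b, hw, inter)
        k = self.phi(x).flatten(2)                     # (b, inter, hw)
        v = self.g(x).flatten(2).transpose(1, 2)       # (b, hw, inter)
        attn = F.softmax(q @ k, dim=-1)                # (b, hw, hw)
        nl = (attn @ v).transpose(1, 2).reshape(b, -1, h, w)
        nl = self.out(nl)                              # (b, c, h, w)
        # Local branch: reweight each position in isolation.
        local = x * torch.sigmoid(self.local_score(x))
        # Fuse local and non-local evidence with a residual connection
        # (the actual fusion used in MAGAN may differ).
        return x + local + nl

# Usage: attend over a 2048-channel encoder feature map.
feats = torch.randn(2, 2048, 7, 7)
fused = MultiAttention(2048)(feats)
print(fused.shape)  # torch.Size([2, 2048, 7, 7])
```

The point of the non-local branch is that each position's output depends on the whole feature map rather than only on its own local activation, which is the gap in prior GAN-based captioners that the abstract identifies.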
