Abstract
Most image captioning methods are trained under full supervision from paired image-caption data. Owing to the expensive cost of collecting such pairs, the task of unpaired image captioning has attracted researchers' attention. In this article, we propose a novel memorial GAN (MemGAN) with joint semantic optimization for unpaired image captioning. The core idea is to explore the implicit semantic correlation between disjoint images and sentences by building a multimodal semantic-aware space (SAS). Concretely, each modality is mapped into a unified multimodal SAS, which contains the semantic vectors of the image I, the visual concepts O, the unpaired sentence S, and the generated caption C. We adopt a memory unit based on multihead attention and a relational gate as the backbone to preserve and transit crucial multimodal semantics in the SAS for image caption generation and sentence reconstruction. The memory unit is then embedded into a GAN framework to exploit the semantic similarity and relevance in the SAS, that is, imposing a joint semantic-aware optimization on the SAS without supervision cues. In summary, the proposed MemGAN learns the latent semantic relevance among the multimodalities of the SAS in an adversarial manner. Extensive experiments and qualitative results demonstrate the effectiveness of MemGAN, which achieves improvements over state-of-the-art methods on unpaired image captioning benchmarks.
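The sketch below illustrates, in PyTorch, the kind of memory unit the abstract describes: multihead attention lets learnable memory slots read multimodal semantic vectors projected into the shared SAS, and a relational gate controls how much of the attended content updates the memory. All names, dimensions, and the exact gating form are assumptions made for illustration, not the authors' precise formulation.

```python
# Minimal sketch of a gated memory unit over a shared semantic-aware space (SAS).
# Hypothetical names and shapes; the paper's exact design may differ.
import torch
import torch.nn as nn


class MemoryUnit(nn.Module):
    def __init__(self, dim: int = 512, num_slots: int = 8, num_heads: int = 8):
        super().__init__()
        # Learnable memory slots living in the shared semantic-aware space.
        self.memory = nn.Parameter(torch.randn(num_slots, dim) * 0.02)
        # Multihead attention: memory slots query the incoming multimodal semantics.
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        # Relational gate: decides how much attended content overwrites the memory.
        self.gate = nn.Sequential(nn.Linear(2 * dim, dim), nn.Sigmoid())

    def forward(self, semantic_vectors: torch.Tensor) -> torch.Tensor:
        """semantic_vectors: (batch, seq, dim) embeddings of image regions,
        visual concepts, or sentence tokens already projected into the SAS."""
        batch = semantic_vectors.size(0)
        mem = self.memory.unsqueeze(0).expand(batch, -1, -1)  # (batch, slots, dim)
        # Memory slots attend over the multimodal semantic vectors.
        attended, _ = self.attn(mem, semantic_vectors, semantic_vectors)
        # Gated update: preserve old memory versus transit new semantics.
        g = self.gate(torch.cat([mem, attended], dim=-1))
        return g * attended + (1.0 - g) * mem


# Example usage: feed image-side semantics; the updated memory would condition
# caption generation (sentence-side semantics would drive reconstruction).
unit = MemoryUnit()
image_semantics = torch.randn(4, 36, 512)   # e.g., 36 region features per image
updated_memory = unit(image_semantics)      # (4, 8, 512)
```

In this reading, the same gated memory is shared between the captioning and reconstruction paths, which is one plausible way to "preserve and transit" semantics across modalities before the adversarial objective is applied.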