Modeling Local and Global Contexts for Image Captioning

Peng Yao, Longteng Guo, Jing Liu, Jiangyun Li

doi:10.1109/icme46284.2020.9102935

Abstract

Image captioning aims to first observe an image, most notably the objects it contains, which are highly context-dependent, and then describe it in natural language. However, most current models use only isolated object vectors as image representations, ignoring the contexts among them. In this paper, we introduce a Local-Global Context (LGC) network, which endows the independent object features with short-range perception (local contexts) and long-range dependence (global contexts). The LGC network can be viewed as a feature refiner that helps the caption decoder reason about novel objects and verbal words. The local contexts are modeled with a 1-D group convolution over adjacent objects, strengthening the local connections. A self-attention mechanism then models the global contexts by correlating all the local contexts. Extensive experiments on the MSCOCO dataset demonstrate that the LGC network can easily be plugged into almost any neural captioning model and significantly improves model performance.
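
The two stages described in the abstract can be illustrated with a minimal sketch in plain Python. This is not the authors' implementation: the function names (`group_conv1d`, `self_attention`, `lgc_refine`), the kernel size, the number of groups, the zero padding, and the use of single-head attention with identity query/key/value projections are all assumptions made for illustration, since the abstract does not specify them. The sketch also assumes the object vectors arrive in some fixed order, so that "adjacent" is well defined.

```python
import math


def group_conv1d(x, weights, groups):
    """1-D group convolution along the object axis (local contexts).

    x:       list of N object vectors, each of length D
    weights: weights[g][k][o][i], one (D/groups x D/groups) matrix per
             group g and kernel tap k (hypothetical layout)
    Zero padding keeps the output length equal to N.
    """
    n, d = len(x), len(x[0])
    gs = d // groups                 # channels per group
    ksize = len(weights[0])
    pad = ksize // 2
    out = [[0.0] * d for _ in range(n)]
    for t in range(n):
        for g in range(groups):
            for k in range(ksize):
                s = t + k - pad      # index of the neighbouring object
                if s < 0 or s >= n:
                    continue         # zero padding at the borders
                for o in range(gs):
                    acc = 0.0
                    for i in range(gs):
                        acc += weights[g][k][o][i] * x[s][g * gs + i]
                    out[t][g * gs + o] += acc
    return out


def self_attention(x):
    """Single-head scaled dot-product self-attention (global contexts).

    Identity projections are assumed, so queries, keys and values are
    all the input vectors themselves.
    """
    n, d = len(x), len(x[0])
    scale = 1.0 / math.sqrt(d)
    out = []
    for q in x:
        scores = [scale * sum(a * b for a, b in zip(q, key)) for key in x]
        m = max(scores)              # subtract the max for stability
        exps = [math.exp(s - m) for s in scores]
        z = sum(exps)
        attn = [e / z for e in exps]
        out.append([sum(attn[j] * x[j][c] for j in range(n)) for c in range(d)])
    return out


def lgc_refine(x, weights, groups):
    """Local contexts first, then global contexts over the local ones."""
    return self_attention(group_conv1d(x, weights, groups))


# Toy usage: 5 objects with 4 channels, 2 groups, kernel size 3.
n, d, groups, ksize = 5, 4, 2, 3
gs = d // groups
objects = [[math.sin(t * d + c + 1.0) for c in range(d)] for t in range(n)]
weights = [[[[0.5 * math.cos(g + 2 * k + 3 * o + 5 * i) for i in range(gs)]
             for o in range(gs)] for k in range(ksize)] for g in range(groups)]
refined = lgc_refine(objects, weights, groups)
print(len(refined), len(refined[0]))
```

In this sketch the convolution lets each object see only its immediate neighbours, and only within its own channel group, while the attention step lets every refined vector depend on all objects. A practical version would more likely use a tensor library and learned projections.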

Similar Papers

The Role of Global and Local Contexts in Pronoun Comprehension
Bing Gao
Acta Psychologica Sinica | VOL. 40
19 Sep 2008

The Effects of Global and Local Stimulus Context on Auditory Frequency Discrimination
I Tsaliach ... M Amel
Journal of Basic and Clinical Physiology and Pharmacology | VOL. 21
01 Jun 2010

Influential Global and Local Contexts Guided Trace Representation for Fault Localization
Zhuo Zhang ... Xiaoguang Mao
ACM Transactions on Software Engineering and Methodology | VOL. 32
26 Apr 2023

Single-Shot Global and Local Context Refinement Neural Network for Head Detection
Jingyuan Hu ... Zhouwang Yang
Future Internet | VOL. 14
19 Dec 2022
