Sentinel mechanism for visual semantic graph-based image captioning

Fen Xiao, Ningru Zhang, Wenfeng Xue, Xieping Gao

doi:10.1016/j.compeleceng.2024.109626

Abstract

Image captioning aims to generate a description of a given image. However, inherent differences in representation between images and sentences make it difficult to align their semantic meanings. Inspired by the human cognitive processes of understanding and describing images, this paper proposes an image captioning framework based on a visual semantic sentinel mechanism. Specifically, we introduce attribute nodes to enable a more comprehensive description of objects and to model the high-level relationships within a visual semantic graph. The visual semantic sentinel mechanism is then proposed to simulate the process of sentence generation. Visual semantic graphs, visual features, and the language information carried by previously generated words are integrated through the semantic sentinel mechanism to align vision-language information and produce contextually relevant descriptions of images. Comprehensive experiments on the challenging MS-COCO dataset demonstrate that our method outperforms previous state-of-the-art methods. The code is publicly available at https://github.com/superatops/SSVSG.
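The abstract does not give the equations of the sentinel mechanism, so the following is only a minimal sketch of the general sentinel-gate idea known from adaptive attention models, not the authors' implementation. It assumes that the decoder scores each visual feature and one extra "sentinel" vector (standing in for the language information) with a shared softmax; the sentinel's weight, here called `beta`, then says how much the next word relies on language context rather than on the image. The names `sentinel_attention`, `query`, `visual_feats`, and `sentinel` are hypothetical, and dot-product scoring is an assumption chosen for brevity.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def sentinel_attention(query, visual_feats, sentinel):
    """Attend jointly over visual features and one sentinel vector.

    Assumption: scores are plain dot products with the decoder query.
    Returns (context, beta), where beta is the softmax weight on the sentinel.
    """
    # One score per visual feature, plus a final score for the sentinel.
    scores = [dot(query, v) for v in visual_feats] + [dot(query, sentinel)]
    weights = softmax(scores)
    beta = weights[-1]
    # Context = weighted sum of visual features + beta * sentinel.
    context = [
        sum(w * v[i] for w, v in zip(weights[:-1], visual_feats))
        + beta * sentinel[i]
        for i in range(len(sentinel))
    ]
    return context, beta


if __name__ == "__main__":
    # Toy vectors, purely illustrative.
    ctx, beta = sentinel_attention(
        query=[1.0, 0.0],
        visual_feats=[[1.0, 0.0], [0.0, 1.0]],
        sentinel=[0.0, 0.0],
    )
    print("context:", ctx, "sentinel weight:", beta)
```

Because the sentinel competes with the visual features inside a single softmax, `beta` always lies strictly between 0 and 1, and a larger `beta` means the visual features contribute less to the context for that word.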

More From: Computers and Electrical Engineering

Similar Papers

A heterogenous automatic feedback semi-supervised method for image reranking
Xin-Chao Xu ... Xin-Shun Xu
01 Jan 2013

A novel mapping method for image semantics and visual features
Jun Yang ... Dan-Jun Xing
Journal of Computer Applications | VOL. 28
30 Sep 2009

Multi-feature Fusion for Predicting Social Media Popularity
Jinna Lv ... Bin Wu
23 Oct 2017

Remote-sensing image retrieval by combining image visual and semantic features
M Wang ... T.Y Song
International Journal of Remote Sensing | VOL. 34
04 Mar 2013
