Abstract

Most image captioning models are designed for English, which differs from Chinese in syntactic structure. The few existing Chinese image captioning models do not fully integrate the global and local features of an image, which limits their ability to represent image details. In this paper, an encoder-decoder architecture based on the fusion of global and local features is used to describe Chinese image content. In the encoding stage, the global and local features of the image are extracted by a Convolutional Neural Network (CNN) and an object detection network, respectively, and fed to a feature fusion module. In the decoding stage, an image feature attention mechanism computes the weights of word vectors, and a new gating mechanism is added to the traditional Long Short-Term Memory (LSTM) network to emphasize the fused image features and the corresponding word vectors. In the description generation stage, the beam search algorithm optimizes the word vector generation process. Together, these three stages strengthen the integration of the image's global and local features, allowing the model to fully capture image details. Experimental results show that the model improves the quality of Chinese descriptions of image content: compared with the baseline model, the CIDEr score improves by 20.07%, and the other evaluation metrics also improve significantly.
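The encoding stage combines one global CNN feature vector with a set of per-region features from the detection network. The abstract does not specify the fusion module's internals, so the sketch below uses one simple, commonly used scheme purely for illustration: average-pool the region features and concatenate the result with the global feature. The function name and the pooling choice are assumptions, not the paper's method.

```python
def fuse_features(global_feat, local_feats):
    """Illustrative fusion of a global CNN feature with detected-region
    features: mean-pool the region vectors, then concatenate with the
    global vector. (One plausible scheme; the paper's exact module is
    not described in the abstract.)

    global_feat: list of floats, length d_g
    local_feats: list of region vectors, each of length d_l
    returns: fused vector of length d_g + d_l
    """
    n = len(local_feats)
    # Average over all detected regions, dimension by dimension.
    pooled = [sum(vals) / n for vals in zip(*local_feats)]
    return list(global_feat) + pooled
```

For example, a 2-d global feature fused with two 2-d region features yields a 4-d vector that the decoder's gating mechanism could then attend over.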
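In the generation stage, beam search keeps the top-k highest-probability partial sentences at each decoding step instead of committing greedily to one word. A minimal, model-agnostic sketch is shown below; `step_fn`, the toy transition table, and the sentinel tokens are illustrative stand-ins for the paper's trained decoder, not its actual interface.

```python
import math

def beam_search(step_fn, start_token, end_token, beam_width=3, max_len=10):
    """Generic beam search: keep the beam_width partial sequences with the
    highest log-probability, expanding each step until end_token or max_len.
    step_fn(seq) must return a {token: probability} dict for the next word.
    """
    beams = [(0.0, [start_token])]   # (cumulative log-prob, sequence)
    completed = []
    for _ in range(max_len):
        candidates = []
        for logp, seq in beams:
            if seq[-1] == end_token:
                completed.append((logp, seq))   # finished hypothesis
                continue
            for tok, p in step_fn(seq).items():
                candidates.append((logp + math.log(p), seq + [tok]))
        if not candidates:
            break
        candidates.sort(key=lambda c: c[0], reverse=True)
        beams = candidates[:beam_width]          # prune to the beam width
    completed.extend(b for b in beams if b[1][-1] == end_token)
    if not completed:
        completed = beams                        # fall back to open beams
    return max(completed, key=lambda c: c[0])[1]

# Toy next-word distribution (a stand-in for the decoder's softmax output).
table = {
    "<s>":  {"the": 0.6, "a": 0.4},
    "the":  {"cat": 0.3, "dog": 0.7},
    "a":    {"cat": 0.9, "dog": 0.1},
    "cat":  {"</s>": 1.0},
    "dog":  {"</s>": 1.0},
}

def step_fn(seq):
    return table[seq[-1]]
```

With `beam_width=2`, the search scores whole sequences ("the dog": 0.42 vs "a cat": 0.36) rather than single steps, which is why captioning decoders prefer it over greedy decoding.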
