Abstract

With the rapid development of Internet of Things (IoT) technology, the volume of image data on the Internet is growing at a remarkable speed, and describing the semantic content of such massive image data poses great challenges. Attention mechanisms originate from the study of human vision: in cognitive science, because of bottlenecks in information processing, humans selectively attend to a portion of the available information while ignoring the rest. This study investigates a natural language description generation method for IoT images based on the attention mechanism. A CMOS image sensor built on IoT technology is used for image acquisition and display: an FPGA samples the CIS 16-bit parallel-port data, writes it to a FIFO to store the image data, and then transmits it through a network interface to a host computer for display. When sentence descriptions are generated with the encoder-decoder framework, maximum-likelihood estimation is used to maximize the joint probability of the word sequence in the language model, which is equivalent to minimizing the cross-entropy loss. At each time step, additional text features are input alongside the image features, and the image feature vector and text feature vector are combined by an attention-weighted sum. During decoding, the attention mechanism assigns a weight to each image-region feature and the long short-term memory (LSTM) network decodes step by step; however, a unidirectional LSTM has limited decoding ability. We therefore replace it with a bidirectional LSTM (BiLSTM), which dynamically attends to context information through a forward LSTM and a reverse LSTM. The specificity of the proposed network is 5% higher than that of a 3D-convolution residual-link network. The results show that feeding both the image context and the text context into the LSTM decoder improves the performance of the image description model.
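
The paper does not include code; the following minimal PyTorch sketch only illustrates the decoding idea the abstract describes: attention weights are computed over image-region features at each step, the weighted context is fused with the word embedding, and a bidirectional LSTM decodes the fused sequence under teacher forcing (a BiLSTM needs the full target sequence, so this is the training-time view). The class name, layer sizes, and all dimensions are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class AttentiveBiLSTMDecoder(nn.Module):
    """Illustrative sketch (not the paper's code): additive attention over
    image-region features, fused with text features, decoded by a BiLSTM."""
    def __init__(self, vocab_size, feat_dim=512, embed_dim=256, hidden_dim=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.attn = nn.Linear(feat_dim + embed_dim, 1)  # additive attention score
        self.bilstm = nn.LSTM(embed_dim + feat_dim, hidden_dim,
                              batch_first=True, bidirectional=True)
        self.out = nn.Linear(2 * hidden_dim, vocab_size)

    def forward(self, regions, tokens):
        # regions: (B, R, feat_dim) image-region features from the CNN encoder
        # tokens:  (B, T) caption token ids (teacher forcing)
        emb = self.embed(tokens)                           # (B, T, E)
        B, T, _ = emb.shape
        R = regions.size(1)
        # Score every (timestep, region) pair: each word queries all regions.
        q = emb.unsqueeze(2).expand(B, T, R, -1)           # (B, T, R, E)
        k = regions.unsqueeze(1).expand(B, T, R, -1)       # (B, T, R, F)
        scores = self.attn(torch.cat([q, k], dim=-1)).squeeze(-1)  # (B, T, R)
        alpha = torch.softmax(scores, dim=-1)              # region weights per step
        context = torch.einsum('btr,brf->btf', alpha, regions)     # weighted sum
        # Forward and reverse LSTM passes decode the fused image/text sequence.
        h, _ = self.bilstm(torch.cat([emb, context], dim=-1))      # (B, T, 2H)
        return self.out(h)                                 # (B, T, vocab) logits

# Training step: maximum-likelihood estimation of the word sequence is the
# same as minimizing cross-entropy between predicted and ground-truth words.
model = AttentiveBiLSTMDecoder(vocab_size=10000)
regions = torch.randn(4, 49, 512)           # e.g. a flattened 7x7 CNN feature map
tokens = torch.randint(0, 10000, (4, 20))   # dummy caption token ids
logits = model(regions, tokens[:, :-1])     # predict the next word at each step
loss = nn.functional.cross_entropy(logits.reshape(-1, 10000),
                                   tokens[:, 1:].reshape(-1))
loss.backward()
```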
