Abstract
With the rapid development of Internet of Things (IoT) technology, the volume of image data on the Internet is growing at a remarkable rate, and describing the semantic content of such massive image data poses great challenges. Attention mechanisms originate from the study of human vision: in cognitive science, because of bottlenecks in information processing, humans selectively attend to a portion of the available information while ignoring the rest. This study discusses a natural language description generation method for IoT intelligent images based on the attention mechanism. A CMOS image sensor (CIS) based on IoT technology is used for image data acquisition and display: an FPGA samples the CIS 16-bit parallel-port data, writes it into a FIFO, stores the image data, and then transmits it to a host computer for display through a network interface. When sentence descriptions are generated with the encoder-decoder framework, maximum-likelihood estimation is used to maximize the joint probability of the word sequence in the language model, thereby minimizing the cross-entropy loss. At each time step, text features are input in addition to image features, and the image feature vector and text feature vector are combined by an attention-weighted sum. During decoding, the attention mechanism assigns a weight to each image region feature and a long short-term memory (LSTM) network decodes step by step; however, a unidirectional LSTM has limited decoding ability. We therefore replace the unidirectional LSTM with a bidirectional LSTM, which dynamically attends to context information through a forward LSTM and a backward LSTM. The specificity of the proposed network is 5% higher than that of the 3D convolutional residual-link network. The results show that the performance of the image description model is improved by feeding both the image context and the text context into the LSTM decoder.
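To make the fusion step concrete, the following is a minimal PyTorch sketch of the idea described above: each word (text) feature queries the image-region features, an attention-weighted sum of the regions is concatenated with the word feature, and the fused sequence is decoded by a bidirectional LSTM trained under the cross-entropy (maximum-likelihood) objective. All module names, dimensions, and the dot-product form of the attention are illustrative assumptions, not the paper's exact architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AttentiveBiLSTMDecoder(nn.Module):
    """Sketch: attention-weighted fusion of image-region and text
    features, decoded by a forward + backward (bidirectional) LSTM."""

    def __init__(self, vocab_size, region_dim=512, embed_dim=256, hidden_dim=512):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.query = nn.Linear(embed_dim, region_dim)        # project word -> attention query
        self.bilstm = nn.LSTM(embed_dim + region_dim, hidden_dim,
                              batch_first=True, bidirectional=True)
        self.logits = nn.Linear(2 * hidden_dim, vocab_size)  # fwd+bwd states -> vocabulary

    def forward(self, regions, words):
        # regions: (B, R, region_dim) image-region features from the encoder
        # words:   (B, T) caption token ids
        e = self.embed(words)                                # (B, T, embed_dim)
        q = self.query(e)                                    # (B, T, region_dim)
        scores = torch.bmm(q, regions.transpose(1, 2))       # (B, T, R) dot-product scores
        alpha = F.softmax(scores, dim=-1)                    # attention weight per region
        context = torch.bmm(alpha, regions)                  # (B, T, region_dim) weighted sum
        h, _ = self.bilstm(torch.cat([e, context], dim=-1))  # (B, T, 2*hidden_dim)
        return self.logits(h)                                # per-step vocabulary logits

# Cross-entropy over the word sequence realizes the maximum-likelihood objective.
decoder = AttentiveBiLSTMDecoder(vocab_size=10000)
regions = torch.randn(4, 49, 512)          # e.g. a 7x7 CNN feature map, flattened
words = torch.randint(0, 10000, (4, 12))   # dummy tokens (inputs would be shifted vs. targets)
logits = decoder(regions, words)
loss = F.cross_entropy(logits.reshape(-1, 10000), words.reshape(-1))
```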