New Encoder Learning for Captioning Heavy Rain Images via Semantic Visual Feature Matching

Chang-Hwan Son, Pung-Hwi Ye

doi:10.2352/j.imagingsci.technol.2021.65.5.050402

Abstract

Image captioning generates text that describes scenes from input images. It has been developed for high-quality images taken in clear weather. However, in bad weather conditions, such as heavy rain, snow, and dense fog, poor visibility as a result of rain streaks, rain accumulation, and snowflakes causes a serious degradation of image quality. This hinders the extraction of useful visual features and results in deteriorated image captioning performance. To address practical issues, this study introduces a new encoder for captioning heavy rain images. The central idea is to transform output features extracted from heavy rain input images into semantic visual features associated with words and sentence context. To achieve this, a target encoder is initially trained in an encoder-decoder framework to associate visual features with semantic words. Subsequently, the objects in a heavy rain image are rendered visible by using an initial reconstruction subnetwork (IRS) based on a heavy rain model. The IRS is then combined with another semantic visual feature matching subnetwork (SVFMS) to match the output features of the IRS with the semantic visual features of the pretrained target encoder. The proposed encoder is based on the joint learning of the IRS and SVFMS. It is trained in an end-to-end manner, and then connected to the pretrained decoder for image captioning. It is experimentally demonstrated that the proposed encoder can generate semantic visual features associated with words even from heavy rain images, thereby increasing the accuracy of the generated captions.
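The joint learning of the IRS and SVFMS described in the abstract can be illustrated with a small sketch of a combined training objective. The abstract does not give the loss, so everything below is an assumption made for illustration: the use of mean squared error for both terms, the availability of a paired clean image, the weights `recon_weight` and `match_weight`, and the function names `mse` and `joint_encoder_loss`. Features and images are represented as flat lists of floats to keep the example self-contained.

```python
from typing import Sequence


def mse(a: Sequence[float], b: Sequence[float]) -> float:
    """Mean squared error between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    if not a:
        raise ValueError("vectors must be non-empty")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def joint_encoder_loss(
    reconstructed: Sequence[float],    # IRS output for the heavy rain image
    clean: Sequence[float],            # paired clean image (assumed available)
    matched_features: Sequence[float], # SVFMS output features
    target_features: Sequence[float],  # features from the pretrained target encoder
    recon_weight: float = 1.0,         # hypothetical weighting, not from the paper
    match_weight: float = 1.0,         # hypothetical weighting, not from the paper
) -> float:
    """Assumed joint objective: image reconstruction term plus feature matching term."""
    reconstruction_term = mse(reconstructed, clean)
    matching_term = mse(matched_features, target_features)
    return recon_weight * reconstruction_term + match_weight * matching_term


if __name__ == "__main__":
    # Toy values only; real inputs would be image tensors and encoder feature maps.
    loss = joint_encoder_loss(
        reconstructed=[1.0, 1.0],
        clean=[0.0, 0.0],
        matched_features=[2.0],
        target_features=[0.0],
        match_weight=0.5,
    )
    print(loss)
```

In this sketch the matching term pulls the features produced from a rainy image toward those the target encoder already associates with words, which is what would let the pretrained decoder be reused unchanged. The actual losses and network details should be taken from the paper itself.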

Journal: Journal of Imaging Science and Technology
Publication Date: Sep 1, 2021
Citations: 2

Similar Papers

Chinese Image Caption Generation via Visual Attention and Topic Modeling.
Maofu Liu ... Lingjun Li
IEEE Transactions on Cybernetics | VOL. 52
22 Jun 2020

Leveraging Self-Distillation and Disentanglement Network to Enhance Visual–Semantic Feature Consistency in Generalized Zero-Shot Learning
Xiaoming Liu ... Guan Yang
Electronics | VOL. 13
18 May 2024

Multimodal Transformer With Multi-View Visual Representation for Image Captioning
Jun Yu ... Qingming Huang
IEEE Transactions on Circuits and Systems for Video Technology | VOL. 30
25 Oct 2019

Multimedia content analytics with modality transition
Ziwei Wang
30 Jul 2021
