Abstract

The demand for high-quality video captured by cameras has grown rapidly with the development of pattern recognition and artificial intelligence. Video denoising is a key technology for obtaining clean videos, yet research on it remains limited. In this paper, we propose a video denoising method based on a convolutional neural network architecture to reduce noise introduced by the sensor system. We improve the noise-estimation loss function by imposing an adaptive penalty on under-estimation of the noise level, which makes our method perform more robustly. Furthermore, we use multi-level features to guide spatial denoising, treating multi-layer semantic information of the image as a perceptual loss. Instead of relying on optical flow to characterize inter-frame information, we use a U-Net-like structure to handle motion implicitly; this is less computationally expensive and avoids distortions caused by inaccurate flow estimates and object occlusion. To locate temporal features and suppress irrelevant information, an attention mechanism is introduced into the skip connections of the U-Net-like structure. Experimental results demonstrate that the proposed method outperforms selected approaches in both peak signal-to-noise ratio (PSNR) and structural similarity index measure (SSIM) when processing Gaussian noise, synthetic real noise, and real noise.
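The asymmetric penalty on noise-level under-estimation mentioned above can be illustrated with a minimal sketch. The function below is an assumption about the general form of such a loss (the abstract does not give the paper's exact formulation): errors where the predicted noise level falls below the true level are weighted by `1 - alpha`, and over-estimates by `alpha`, so choosing `alpha < 0.5` penalizes under-estimation more heavily. All names and the hyperparameter `alpha` are illustrative.

```python
import numpy as np

def asymmetric_noise_loss(sigma_pred, sigma_true, alpha=0.3):
    """Illustrative asymmetric loss for noise-level estimation.

    With alpha < 0.5, under-estimates (sigma_pred < sigma_true)
    receive weight 1 - alpha > 0.5, while over-estimates receive
    weight alpha, so under-estimation is penalized more strongly.
    This is a sketch, not the paper's exact loss.
    """
    e = np.asarray(sigma_pred, dtype=float) - np.asarray(sigma_true, dtype=float)
    under = (e < 0).astype(float)        # 1 where the noise level is under-estimated
    weight = np.abs(alpha - under)       # 1 - alpha if under-estimated, else alpha
    return float(np.mean(weight * e ** 2))
```

For example, with `alpha = 0.3`, an under-estimate of 0.1 costs more than an over-estimate of the same magnitude, which biases the estimator toward conservative (higher) noise-level predictions and tends to make the subsequent denoiser more robust.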
