Abstract

Video compression has recently gained considerable attention among computer vision problems in media technologies. With state-of-the-art video compression methods, videos can be transmitted at higher quality while requiring less bandwidth and storage. The advent of neural network-based video compression methods has remarkably improved video coding performance. In this paper, a video compression method based on a Recurrent Neural Network (RNN) is presented. The method comprises an encoder, a middle module, and a decoder. A binarizer is employed in the middle module to achieve better quantization performance. In the encoder and decoder modules, long short-term memory (LSTM) units are used to retain valuable information and discard unnecessary information, iteratively reducing the quality loss of the reconstructed video. The method reduces the complexity of neural network-based compression schemes and encodes videos with less quality loss. The proposed method is evaluated using the peak signal-to-noise ratio (PSNR), video multimethod assessment fusion (VMAF), and structural similarity index measure (SSIM) quality metrics. Applied to two public video compression datasets, the proposed method outperforms standard video coding schemes such as H.264 and H.265.
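The iterative encode–binarize–decode loop described above can be illustrated with a minimal toy sketch. This is not the paper's method: the fixed random projections below are hypothetical stand-ins for the trained LSTM encoder and decoder, the hard sign binarizer mirrors only the forward pass of a learned binarizer, and a least-squares scaling step is added so that each pass provably cannot increase the reconstruction error in this toy setting.

```python
import numpy as np

rng = np.random.default_rng(0)

def binarize(x):
    # Hard sign quantizer producing a +/-1 bit vector; during training,
    # a straight-through estimator would pass gradients through unchanged.
    return np.where(x >= 0.0, 1.0, -1.0)

def psnr(ref, rec, peak=1.0):
    # Standard PSNR definition: 10 * log10(peak^2 / MSE).
    mse = np.mean((ref - rec) ** 2)
    return 10.0 * np.log10(peak ** 2 / mse)

# Hypothetical untrained "encoder" and "decoder": fixed random linear
# maps standing in for the LSTM-based modules of the actual method.
D, B = 16, 8                                   # signal dim, bits per pass
W_enc = rng.standard_normal((B, D)) / np.sqrt(D)
W_dec = rng.standard_normal((D, B)) / np.sqrt(B)

def compress(frame, iters=4):
    """Iterative residual coding: each pass encodes what the previous
    passes failed to reconstruct, so quality improves with more bits."""
    recon = np.zeros_like(frame)
    residual = frame.copy()
    for _ in range(iters):
        bits = binarize(W_enc @ residual)      # quantized code for this pass
        update = W_dec @ bits                  # decoded contribution
        # Project the residual onto the update direction (least-squares
        # scale), guaranteeing the error is non-increasing in this sketch.
        alpha = (residual @ update) / max(update @ update, 1e-12)
        recon += alpha * update
        residual = frame - recon
    return recon

frame = rng.standard_normal(D)
recon = compress(frame, iters=4)
```

In the actual RNN-based scheme, the LSTM state carries information across passes so the encoder can adapt each code to what the decoder already knows; the toy loop above only captures the progressive-refinement structure.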
