Learning to compress videos without computing motion

Meixu Chen, Todd Goodall, Anjul Patney, Alan C Bovik

doi:10.1016/j.image.2022.116633


Open Access

https://doi.org/10.1016/j.image.2022.116633


Abstract

Video has become an increasingly important part of daily digital communication. With the development of higher-resolution content and displays, its sheer volume poses significant challenges to acquiring, transmitting, compressing, and displaying high-quality video content. In this paper, we propose a new deep learning video compression architecture that does not require motion estimation, which is the most expensive element of modern hybrid video compression codecs like H.264 and HEVC. Our framework exploits the regularities inherent to video motion, which we capture by using displaced frame differences as video representations to train the neural network. In addition, we propose a new space–time reconstruction network based on both an LSTM model and a UNet model, which we call LSTM-UNet. The combined network efficiently captures both temporal and spatial video information, making it well suited to our purposes. The new video compression framework has three components: a Displacement Calculation Unit (DCU), a Displacement Compression Network (DCN), and a Frame Reconstruction Network (FRN), all of which are jointly optimized against a single perceptual loss function. The DCU removes the need for the motion estimation found in hybrid codecs and is less expensive. In the DCN, an RNN-based network compresses displaced frame differences while retaining temporal information between frames. The LSTM-UNet is used in the FRN to learn space–time differential representations of videos. Our experimental results show that our compression model, which we call the MOtionless VIdeo Codec (MOVI-Codec), learns how to efficiently compress videos without computing motion.
Our experiments show that MOVI-Codec outperforms the Low-Delay P (LDP) veryfast setting of the video coding standard H.264 and exceeds the performance of the modern global standard HEVC codec, using the same setting, as measured by MS-SSIM, especially on higher-resolution videos. In addition, our network outperforms the latest H.266 (VVC) codec at higher bitrates, when assessed using MS-SSIM, on high-resolution videos. The MOVI-Codec project page can be found at https://github.com/Meixu-Chen/MOVI-Codec.
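
The idea of replacing motion search with displaced frame differences can be illustrated with a small sketch. The abstract does not give the displacement set, the border handling, or the data layout, so everything concrete below is an assumption made for illustration: the helper names `shift` and `displaced_frame_differences`, the one-pixel displacements in four directions, and the edge-replication padding. It is not the authors' implementation, only a minimal example of subtracting fixed, pre-chosen shifts of the previous frame from the current one instead of searching for motion vectors.

```python
from typing import Dict, List, Tuple

Frame = List[List[int]]  # grayscale frame stored as rows of pixel values


def shift(frame: Frame, dy: int, dx: int) -> Frame:
    """Shift a frame by (dy, dx) pixels; border pixels are replicated (assumed padding)."""
    h, w = len(frame), len(frame[0])
    return [
        [frame[min(max(y - dy, 0), h - 1)][min(max(x - dx, 0), w - 1)]
         for x in range(w)]
        for y in range(h)
    ]


def displaced_frame_differences(
    curr: Frame, prev: Frame, displacements: List[Tuple[int, int]]
) -> Dict[Tuple[int, int], Frame]:
    """Subtract each fixed shift of the previous frame from the current frame.

    No search is performed: the displacement set is chosen in advance.
    """
    result = {}
    for dy, dx in displacements:
        shifted = shift(prev, dy, dx)
        result[(dy, dx)] = [
            [c - p for c, p in zip(curr_row, prev_row)]
            for curr_row, prev_row in zip(curr, shifted)
        ]
    return result


# Toy example: a single bright pixel moves one column to the right.
prev_frame = [[0, 0, 0, 0],
              [0, 9, 0, 0],
              [0, 0, 0, 0]]
curr_frame = [[0, 0, 0, 0],
              [0, 0, 9, 0],
              [0, 0, 0, 0]]

# Hypothetical displacement set: no shift plus one pixel in each direction.
offsets = [(0, 0), (0, 1), (0, -1), (1, 0), (-1, 0)]
dfd = displaced_frame_differences(curr_frame, prev_frame, offsets)

# Sum of absolute differences per displacement; smaller means a better match.
energy = {d: sum(abs(v) for row in diff for v in row) for d, diff in dfd.items()}
```

In this toy case the difference for the displacement `(0, 1)` is all zeros because that shift matches the true motion, while the unshifted difference `(0, 0)` still contains residual energy. A learned codec could take the whole stack of differences as input and let the network weigh them, rather than selecting one explicitly.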


Journal: Signal Processing: Image Communication
Publication Date: Jan 6, 2022
Citations: 5
License type: publisher-specific-oa


