Abstract

This paper presents a convolutional neural network (CNN)-based enhancement to inter prediction in Versatile Video Coding (VVC). Our approach aims to improve the prediction signal of inter blocks with a residual CNN that incorporates spatial and temporal reference samples. It is motivated by the theoretical consideration that neural network-based methods have a higher degree of signal adaptivity than conventional signal processing methods, and that spatially neighboring reference samples have the potential to improve the prediction signal by adapting it to the reconstructed signal in its immediate vicinity. We show that adding a polyphase decomposition stage to the CNN results in a significantly better trade-off between computational complexity and coding performance. Incorporating spatial reference samples into the inter prediction process is challenging: because the input of the CNN for one block may depend on the output of the CNN for preceding blocks, parallel processing is prohibited. We solve this by introducing a novel signal plane that contains specifically constrained reference samples, enabling parallel decoding while maintaining high compression efficiency. Overall, experimental results show average bit rate savings of 4.07% and 3.47% for the random access (RA) and low-delay B (LB) configurations of the JVET common test conditions, respectively.
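To make the polyphase decomposition stage concrete, the sketch below shows the standard space-to-depth rearrangement on which such a stage is typically built: an H×W block is split into its polyphase components, which are stacked along the channel dimension so that subsequent convolutional layers operate at reduced spatial resolution and hence reduced cost. This is a minimal illustration assuming a factor-2 decomposition of a single-channel block; the exact factor, placement, and network architecture used in the paper are not specified here.

```python
import torch


def polyphase_decompose(x: torch.Tensor, factor: int = 2) -> torch.Tensor:
    """Split an image into factor**2 polyphase components stacked along channels.

    x: tensor of shape (N, C, H, W) with H and W divisible by `factor`.
    Returns a tensor of shape (N, C * factor**2, H // factor, W // factor).
    Equivalent to torch.nn.functional.pixel_unshuffle(x, factor).
    """
    n, c, h, w = x.shape
    # Expose the polyphase indices as separate dimensions ...
    x = x.view(n, c, h // factor, factor, w // factor, factor)
    # ... then fold them into the channel dimension.
    return x.permute(0, 1, 3, 5, 2, 4).reshape(
        n, c * factor * factor, h // factor, w // factor
    )


# Illustrative example (hypothetical block size): a 16x16 luma block becomes
# four 8x8 phase planes, so the convolutions that follow see the same samples
# at a quarter of the spatial cost.
block = torch.randn(1, 1, 16, 16)
phases = polyphase_decompose(block)
print(phases.shape)  # torch.Size([1, 4, 8, 8])
```

Because the rearrangement is a lossless permutation of samples, it trades spatial resolution for channel depth without discarding information, which is consistent with the complexity/performance trade-off described above.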
