Deep Multi-Domain Prediction for 3D Video Coding

Jianjun Lei, Ying Chen, Dengchao Jin, Yanan Shi, Zhaoqing Pan, Dong Liu, Nam Ling

doi:10.1109/tbc.2021.3090261

Abstract

Three-dimensional (3D) video contains rich multi-domain correlations, including spatial, temporal, and inter-view correlations. In this paper, a deep multi-domain prediction method is proposed for 3D video coding. Unlike previous methods, the proposed method exploits not only spatial and temporal correlations but also inter-view correlation to obtain a more accurate prediction, and adopts deep convolutional neural networks to fuse the multi-domain references effectively. More specifically, a hierarchical prediction mechanism, comprising a spatial-temporal prediction network and a multi-domain prediction network, is designed to overcome the difficulty of fusing multi-domain reference information. Furthermore, a progressive spatial-temporal prediction network and a multi-scale multi-domain prediction network are designed to obtain the spatial-temporal prediction result and the multi-domain prediction result, respectively. Experimental results show that the proposed method achieves considerable bitrate savings compared with 3D-HEVC.

Full Text