Abstract

Three-dimensional (3D) video contains plentiful multi-domain correlations, including spatial, temporal, and inter-view correlations. In this paper, a deep multi-domain prediction method is proposed for 3D video coding. Different from previous methods, our proposed method utilizes not only spatial and temporal correlations but also inter-view correlation to obtain a more accurate prediction, and adopts deep convolutional neural networks to effectively fuse multi-domain references. More specifically, a hierarchical prediction mechanism, which includes a spatial-temporal prediction network and a multi-domain prediction network, is designed to overcome the fusion difficulty of multi-domain reference information. Furthermore, a progressive spatial-temporal prediction network and a multi-scale multi-domain prediction network are designed to obtain the spatial-temporal prediction result and multi-domain prediction result respectively. Experimental results show that the proposed method achieves considerable bitrate saving compared with 3D-HEVC.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call