Abstract

Depth video consists largely of smooth regions separated by sharp edges. Deep learning-based intra prediction methods designed for color video ignore these characteristics and are therefore poorly suited to improving the coding efficiency of depth video. This paper proposes a multiple-resolution prediction method with deep up-sampling to improve the coding efficiency of depth video. To encode depth blocks of differing complexity efficiently, each depth block is selectively encoded at one of three resolutions: $\times 1$, $\times 1/2$, or $\times 1/4$. When a block is encoded at low resolution (LR), the reconstructed LR depth block is restored to full resolution by an up-sampling network. To constrain the quality of both the reconstructed high-resolution depth block and its synthesized view, a view synthesis distortion guidance mechanism is proposed for the up-sampling network. In addition, a distillation-based lightweight up-sampling network is proposed to reduce computational complexity. Experimental results show that the proposed multiple-resolution prediction method achieves an average BD-rate saving of 10.84% compared with 3D-HEVC.
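The per-block resolution selection described above can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions: nearest-neighbour replication stands in for the up-sampling network, plain sum of squared differences stands in for the view synthesis distortion guidance, the number of coded samples serves as a crude rate proxy, and the function names and the Lagrange multiplier `lam` are hypothetical.

```python
def downsample(block, factor):
    """Average-pool a square block by `factor` (factor 1 returns a copy)."""
    n = len(block)
    out = []
    for r in range(0, n, factor):
        row = []
        for c in range(0, n, factor):
            total = sum(block[r + i][c + j]
                        for i in range(factor) for j in range(factor))
            row.append(round(total / (factor * factor)))
        out.append(row)
    return out


def upsample_nearest(block, factor):
    """Nearest-neighbour up-sampling; a stand-in for the up-sampling network."""
    out = []
    for row in block:
        wide = [v for v in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


def ssd(a, b):
    """Sum of squared differences between two equally sized blocks."""
    return sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def choose_resolution(block, lam=10.0, factors=(1, 2, 4)):
    """Pick the down-sampling factor with the lowest cost J = D + lam * R.

    D is the SSD after down- and up-sampling; R is the number of coded
    samples (an assumed rate proxy, not a real entropy-coded bit count).
    """
    best_factor, best_cost = None, None
    for f in factors:
        low = downsample(block, f)
        recon = upsample_nearest(low, f)
        cost = ssd(block, recon) + lam * len(low) * len(low[0])
        if best_cost is None or cost < best_cost:
            best_factor, best_cost = f, cost
    return best_factor


# A flat block loses nothing at low resolution; a detailed block does.
flat = [[50] * 8 for _ in range(8)]
detail = [[200 if (r + c) % 2 else 0 for c in range(8)] for r in range(8)]
print(choose_resolution(flat), choose_resolution(detail))
```

Under these assumptions, a smooth block is sent at the lowest resolution because down-sampling adds no distortion while reducing the sample count, whereas a block with fine detail stays at full resolution.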
