Abstract

A 3D video stream is typically obtained from a set of synchronized cameras that simultaneously capture the same scene (multiview video). This technology enables applications such as free-viewpoint video, which allows the viewer to select a preferred viewpoint, or 3D TV, where the depth of the scene can be perceived using a special display. Because the user-selected view does not always correspond to a camera position, it may be necessary to synthesize a virtual camera view. To synthesize such a virtual view, we have adopted a depth image-based rendering technique that employs one depth map for each camera. Consequently, remote rendering of the 3D video requires a compression technique for both texture and depth data. This paper presents a predictive-coding algorithm for the compression of depth images across multiple views. The presented algorithm provides (a) improved coding efficiency for depth images over block-based motion-compensation encoders (H.264), and (b) random access to different views for fast rendering. The proposed depth-prediction technique works by synthesizing (computing) the depth of 3D points based on the reference depth image. The attractiveness of the depth-prediction algorithm is that it avoids an independent transmission of depth for each view, while simplifying view interpolation by synthesizing depth images for arbitrary viewpoints. We present experimental results for several multiview depth sequences, showing a quality improvement of up to 1.8 dB compared to H.264 compression.
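To illustrate the kind of depth prediction the abstract describes, the following Python sketch forward-warps a reference depth map into a virtual view: each reference pixel is back-projected to a 3D point using its depth, transformed into the virtual camera frame, and reprojected. The parameter names (K_ref, K_virt, R, t) and the simple z-buffer for resolving collisions are illustrative assumptions, not the paper's exact procedure.

```python
import numpy as np

def warp_depth(depth_ref, K_ref, K_virt, R, t):
    """Predict a virtual-view depth map by 3D-warping a reference depth map.

    depth_ref      : (H, W) depth map of the reference camera
    K_ref, K_virt  : (3, 3) camera intrinsic matrices (assumed known)
    R, t           : rotation (3, 3) and translation (3,) from the
                     reference to the virtual camera frame
    """
    H, W = depth_ref.shape
    u, v = np.meshgrid(np.arange(W), np.arange(H))
    pix = np.stack([u.ravel(), v.ravel(), np.ones(H * W)])  # homogeneous pixels

    # Back-project every reference pixel to a 3D point using its depth.
    pts = np.linalg.inv(K_ref) @ pix * depth_ref.ravel()

    # Transform the points into the virtual camera frame and project them.
    pts_virt = R @ pts + t[:, None]
    z = pts_virt[2]
    valid = z > 1e-6                      # keep points in front of the camera
    proj = (K_virt @ pts_virt)[:, valid] / z[valid]
    u2 = np.round(proj[0]).astype(int)
    v2 = np.round(proj[1]).astype(int)
    z = z[valid]

    # Scatter depths into the virtual view; where several points land on the
    # same pixel, keep the nearest surface (a simple z-buffer assumption).
    inside = (u2 >= 0) & (u2 < W) & (v2 >= 0) & (v2 < H)
    depth_virt = np.full((H, W), np.inf)
    for ui, vi, zi in zip(u2[inside], v2[inside], z[inside]):
        if zi < depth_virt[vi, ui]:
            depth_virt[vi, ui] = zi
    return depth_virt
```

In a predictive-coding setting, a warped map like this would serve as the prediction, so only the residual between it and the actual depth of the target view needs to be encoded; pixels left at infinity correspond to disocclusions that the prediction cannot cover.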
