Abstract
Remote dynamic three-dimensional (3D) scene reconstruction renders the moving structure of a 3D scene at a remote terminal using both a color video and its corresponding depth maps. It has shown great potential for telepresence applications such as remote monitoring and remote medical imaging. In this setting, video-rate capture and high resolution are two crucial properties of a good depth map, yet they conflict with each other in depth-sensor acquisition. Recent works therefore transmit only the high-resolution color video to the terminal side and reconstruct the scene depth there by estimating motion vectors from the video, typically with propagation-based methods that achieve video-rate depth reconstruction. In most remote transmission systems, however, only the compressed color video stream is available. The video restored from this stream suffers quality losses, so the extracted motion vectors are too inaccurate for depth reconstruction. In this paper, we propose a precise and robust scheme for dynamic 3D scene reconstruction from the compressed color video stream and its inaccurate motion vectors. Our method rectifies the inaccurate motion vectors by analyzing and compensating for their quality losses, the absence of motion vectors in spatially predicted blocks, and their dislocation in near-boundary regions. This rectification ensures the depth maps can be reconstructed at the terminal side at both video rate and high resolution, reducing the system's compression and transmission cost. Our experiments validate that the proposed scheme reconstructs depth maps and dynamic scenes robustly over long propagation distances, even at high compression ratios, outperforming the benchmark approaches by at least 3.3950 dB in quality for remote applications.
Highlights
Depth maps are crucial for three-dimensional (3D) imaging and display, which have been widely deployed in 3D virtual scene perception [1,2], 3D shape analysis [3], and 3D reconstructions of cells, objects [4,5,6], and organs [7]
After all motion vectors (MVs) are extracted from the compressed video bitstream, the depth map is reconstructed mainly by Equation 1; the result is shown in Figure 7(A)
Some holes, shown in white, appear in the reconstructed depth map: the corresponding blocks are encoded in Intra mode, so no MV is available for reconstruction
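The propagation described above can be sketched as follows. This is a minimal illustration, not the paper's Equation 1: the block size, the `None` encoding of Intra blocks, and the nearest-valid-on-row hole filling are all assumptions made for the example. It shows how an inter-coded block copies depth from the MV-displaced location in the previous frame, while an Intra block (no MV) leaves a hole that must be compensated afterwards.

```python
import numpy as np

BLOCK = 2  # block size in pixels (assumed; codecs typically use 4-16)

def propagate_depth(prev_depth, mvs):
    """Propagate the previous frame's depth into the current frame.
    mvs[by][bx] is an (dy, dx) motion vector for an inter block,
    or None for an Intra block, which leaves a NaN hole."""
    h, w = prev_depth.shape
    depth = np.full((h, w), np.nan)          # NaN marks holes
    for by in range(h // BLOCK):
        for bx in range(w // BLOCK):
            mv = mvs[by][bx]
            if mv is None:
                continue                     # Intra mode: MV absent, keep hole
            dy, dx = mv
            y, x = by * BLOCK, bx * BLOCK
            sy, sx = y + dy, x + dx          # MV-displaced source position
            if 0 <= sy <= h - BLOCK and 0 <= sx <= w - BLOCK:
                depth[y:y+BLOCK, x:x+BLOCK] = \
                    prev_depth[sy:sy+BLOCK, sx:sx+BLOCK]
    return depth

def fill_holes(depth):
    """Naive compensation: replace each NaN with the nearest valid depth
    on the same row (a stand-in for a proper hole-compensation step)."""
    out = depth.copy()
    cols = np.arange(out.shape[1])
    for row in out:
        idx = np.where(~np.isnan(row))[0]
        if idx.size:
            nearest = idx[np.abs(idx[:, None] - cols[None, :]).argmin(axis=0)]
            row[:] = row[nearest]
    return out

# Tiny example: one Intra block among three inter blocks with zero MVs.
prev = np.arange(16, dtype=float).reshape(4, 4)
mvs = [[(0, 0), None], [(0, 0), (0, 0)]]
holey = propagate_depth(prev, mvs)           # top-right block is a NaN hole
filled = fill_holes(holey)                   # hole replaced by row neighbors
```

A real system would fill holes with a spatial or temporal propagation scheme rather than row-wise nearest neighbor, but the structure (propagate inter blocks, then compensate Intra holes) is the same.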
Summary
Depth maps are crucial for three-dimensional (3D) imaging and display, which have been widely deployed in 3D virtual scene perception [1,2], 3D shape analysis [3], and 3D reconstructions of cells, objects [4,5,6], and organs [7]. The widely used RGB-D [12,17] (e.g., Kinect) and ToF [13,18] cameras can capture depth maps at video rate only at very low resolution, e.g., 320×240 pixels, and can hardly reach higher resolutions such as standard definition or above. Another issue arises from the extra bandwidth cost of sending the depth map alongside the color video stream in telecommunication. In most cases, only the color video is compressed and transmitted to the terminal side, for dynamic 2D scene presentation rather than 3D reconstruction