This paper proposes a weighted large margin nearest center (WLMNC) distance-based human depth recovery method for tele-immersive video interaction systems with limited bandwidth consumption. In the remote stage, the proposed method compresses the depth data of the remote human into skeletal block structures by learning the WLMNC distance, which is equivalent to downsampling the human depth map at a $64{\times}$ sampling rate. In the local stage, the method first recovers a rough human depth map using a WLMNC-distance-augmented clustering approach and then obtains a fine depth map using an autoregressive model guided by the rough depth map, which preserves depth discontinuities and suppresses texture copy artifacts. The proposed WLMNC distance is learned by solving a large margin clustering problem with a weighted hinge loss that balances clustering accuracy against depth recovery accuracy, and it is verified to preserve depth discontinuities between skeletal block structures under occlusion. A theoretical analysis is conducted to verify the effectiveness of the weighted hinge loss. Furthermore, a novel data set containing various types of human postures with self-occlusion is built to benchmark human depth recovery methods. A quantitative comparison with state-of-the-art depth recovery methods on the introduced benchmark data set demonstrates the effectiveness of the proposed method for human depth recovery at such a high upsampling rate.