Abstract

Compression-distorted multi-view video plus depth (MVD) should be enhanced at the receiver side without access to the original signals, especially the depth maps, because they describe positioning information in 3D space and are essential for subsequent virtual view synthesis. However, challenges arise in how to exploit the contribution of multi-modality priors from neighboring viewpoints, and how to handle gradient vanishing when textureless depth maps are involved. In this paper, we propose a multi-modality residual network to enhance the quality of compressed multi-view depth video. Taking advantage of the high correlation among different viewpoints, depth maps from adjacent views are exploited as guidance for the enhancement of the depth video in the target view. Color frames in the target view are also involved to provide information on object contours, yielding multi-modality guidance. The proposed network is organized as a deep residual network to effectively eliminate distortion and restore details. Because the above multi-modality priors have different correlations with the target depth video and not all of their information contributes to the enhancement, an adaptive skip structure is designed to further exploit the contribution of the different priors appropriately. Experimental results show that our scheme outperforms other benchmarks, achieving average gains of 1.935 dB in PSNR and 0.0227 in SSIM over all test sequences. Objective, subjective and 3D reconstruction results all suggest that our method delivers superior performance in practical applications.
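The sketch below is a minimal, hypothetical illustration of the kind of architecture the abstract describes: a residual trunk fed by three modality branches (compressed target depth, adjacent-view depths, target-view color), with a gated "adaptive skip" weighting each guidance branch. It is not the authors' exact network; layer counts, channel widths, and the gating form are assumptions made only for illustration.

```python
# Hypothetical PyTorch sketch of an adaptive multi-modality residual network.
# All hyperparameters and the gating mechanism are assumptions, not the paper's design.
import torch
import torch.nn as nn

class ResBlock(nn.Module):
    def __init__(self, ch):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(ch, ch, 3, padding=1))

    def forward(self, x):
        return x + self.body(x)  # residual connection eases training on textureless depth

class AdaptiveMultiModalityNet(nn.Module):
    def __init__(self, ch=64, n_blocks=8):
        super().__init__()
        # Separate shallow encoders, one per modality/prior.
        self.enc_depth = nn.Conv2d(1, ch, 3, padding=1)   # compressed target-view depth
        self.enc_adj   = nn.Conv2d(2, ch, 3, padding=1)   # two adjacent-view depth maps
        self.enc_color = nn.Conv2d(3, ch, 3, padding=1)   # target-view color frame
        # "Adaptive skip" (assumed realization): per-channel gates deciding how much
        # each guidance branch is allowed to contribute.
        self.gate_adj   = nn.Sequential(nn.AdaptiveAvgPool2d(1),
                                        nn.Conv2d(ch, ch, 1), nn.Sigmoid())
        self.gate_color = nn.Sequential(nn.AdaptiveAvgPool2d(1),
                                        nn.Conv2d(ch, ch, 1), nn.Sigmoid())
        self.fuse  = nn.Conv2d(3 * ch, ch, 1)
        self.trunk = nn.Sequential(*[ResBlock(ch) for _ in range(n_blocks)])
        self.tail  = nn.Conv2d(ch, 1, 3, padding=1)

    def forward(self, depth, adj_depths, color):
        f_d = self.enc_depth(depth)
        f_a = self.enc_adj(adj_depths)
        f_c = self.enc_color(color)
        # Down-weight guidance features that correlate poorly with the target depth.
        f_a = f_a * self.gate_adj(f_a)
        f_c = f_c * self.gate_color(f_c)
        feat = self.fuse(torch.cat([f_d, f_a, f_c], dim=1))
        residual = self.tail(self.trunk(feat))
        return depth + residual  # the network predicts a correction to the distorted depth

# Example: enhance one 256x256 compressed depth map using two neighboring views.
net = AdaptiveMultiModalityNet()
enhanced = net(torch.rand(1, 1, 256, 256),   # distorted target depth
               torch.rand(1, 2, 256, 256),   # left/right adjacent-view depths
               torch.rand(1, 3, 256, 256))   # co-located color frame
```

Predicting a residual rather than the depth map itself keeps the trunk focused on removing compression artifacts, while the gates let the network suppress guidance that does not help a particular region.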

Highlights

  • Multi-view video plus depth (MVD) is the fundamental data representation of three-dimensional (3D) and interactive visual applications, including super multi-view video, free viewpoint television and virtual reality [1], [2]

  • In this paper, we propose an adaptive multi-modality residual network for the enhancement of depth maps distorted by compression

  • Depth maps from adjacent views and the color images corresponding to the target depth maps are taken as multi-modality priors


Summary

INTRODUCTION

Multi-view video plus depth (MVD) is the fundamental data representation for three-dimensional (3D) and interactive visual applications, including super multi-view video, free viewpoint television and virtual reality [1], [2]. Depth quality enhancement has seen rapid development in recent years, and previously proposed filters and methods have been successful on this topic [6]–[9]. These filters face difficulties, however, when compression distortions are present in depth videos. Learning-based methods have been proposed that can adaptively handle the artifacts in depth maps [10]–[12]. These works follow network structures similar to those used for color image artifacts, and corresponding color images are usually taken into account as guidance. Since depth maps from adjacent views suffer from compression distortion as well, color frames in the current view are involved to provide information on object contours. These references and guidance are combined and regarded as multi-modality guidance.

RELATED WORKS
ADAPTIVE MULTI-MODALITY RESIDUAL NETWORK
Findings
CONCLUSION
