Enhancing 3D video watching experiences: Tackling compression and 3D warping distortions in synthesized view with perceptual guidance

Huan Zhang, Xu Zhang, Linwei Zhu, Yun Zhang, Jiangzhong Cao, Wing-Kuen Ling

doi:10.1016/j.eswa.2024.125853

Abstract

In 3D video systems, synthesized videos are typically rendered with view synthesis technology, mainly Depth Image Based Rendering (DIBR), and suffer from both compression and 3D warping artifacts, which may degrade the perceptual quality of 3D video. Because viewers readily notice DIBR distortions such as cracks and irregular stretching, Synthesized View Quality Enhancement (SVQE) should pay particular attention to DIBR distortion. In this paper, we propose a Distortion Map-guided Asymmetrical encoder–decoder restoration Network for SVQE, termed DMANet, which prioritizes human perceptual factors while balancing effectiveness and efficiency. Specifically, to account for these perceptual characteristics, a distortion-aware module embeds the predicted DIBR distortion into the restoration network through multi-scale feature embedding and works together with a DIBR distortion prediction loss to focus the network on DIBR-distorted regions. Meanwhile, to improve the efficiency of the U-shaped network, an asymmetrical encoder–decoder restoration network is proposed, in which the encoder progressively integrates both transformer and CNN modules for local–global feature extraction, while the decoder uses only CNN modules. Furthermore, hybrid transformer-based modules incorporate channel attention interaction and convolutional filters to fully exploit the channel-wise global modeling ability of self-attention while preserving local details. Extensive experiments show that the proposed DMANet outperforms state-of-the-art (SOTA) SVQE methods and is comparable to SOTA image restoration methods while using fewer model parameters, fewer FLOPs, and less running time.
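
To illustrate what "channel-wise global modeling ability of self-attention" can mean in practice, the following is a minimal sketch of channel-wise self-attention in plain Python. It is not the authors' implementation: the function name channel_attention, the 1/sqrt(N) scaling, and the toy feature values are all assumptions made for illustration, and real designs typically add learned projections and a different normalization. The point it shows is that the attention map is C x C (between channels) rather than N x N (between pixels), so every output channel still aggregates information from all spatial positions.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def channel_attention(q, k, v):
    """Channel-wise self-attention (illustrative sketch, not DMANet itself).

    q, k, v: lists of C channels, each a flat list of N = H*W values.
    Returns (out, attn) where attn is the C x C attention map.
    """
    n = len(q[0])
    scale = 1.0 / math.sqrt(n)  # assumed scaling; published variants differ
    # Similarity between every pair of channels, computed over all pixels.
    attn = [
        softmax([scale * sum(a * b for a, b in zip(qi, kj)) for kj in k])
        for qi in q
    ]
    # Each output channel is a convex combination of the value channels.
    out = [
        [sum(w * vj[p] for w, vj in zip(row, v)) for p in range(n)]
        for row in attn
    ]
    return out, attn


# Toy feature map: C = 2 channels, N = 4 pixels (hypothetical values).
feats = [[1.0, 0.0, 2.0, 1.0], [0.5, 1.5, 0.0, 1.0]]
out, attn = channel_attention(feats, feats, feats)
```

Because the attention map has size C x C, the cost of building it grows linearly with the number of pixels, which is one reason channel attention is attractive for high-resolution restoration.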
