In multi-view video systems, the decoded texture video and its corresponding depth video are used to synthesize virtual views from different perspectives via depth-image-based rendering (DIBR) in 3D High Efficiency Video Coding (3D-HEVC). However, compression distortion in the multi-view video and the disocclusion problem in DIBR can easily produce obvious holes and cracks in the synthesized views, degrading their visual quality. To address this problem, a novel two-stream re-parameterized refocusing hybrid attention (TRRHA) network is proposed to significantly improve the quality of synthesized views. First, a global multi-scale residual information stream extracts global context with a refocusing attention module (RAM), which detects contextual features and adaptively learns channel and spatial attention to selectively focus on different areas. Second, a local feature pyramid attention information stream fully captures complex local texture details with a re-parameterized refocusing attention module (RRAM), which captures multi-scale texture details under different receptive fields and adaptively adjusts channel and spatial weights to handle information at different sizes and levels. Finally, an efficient feature fusion module is proposed to fuse the extracted global and local information streams. Extensive experimental results show that the proposed TRRHA significantly outperforms state-of-the-art methods. The source code will be available at https://github.com/647-bei/TRRHA.
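The abstract does not specify the internals of the RAM or RRAM; as a rough illustration of the generic "channel and spatial attention" idea it invokes, the following is a minimal NumPy sketch (not the authors' implementation) in which a feature map is first gated per channel by a pooled channel descriptor and then gated per spatial location. The pooling choices, gating order, and use of a plain sigmoid without learned weights are all simplifying assumptions for illustration.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(feat):
    # feat: (C, H, W). Global average pooling over spatial dims -> (C,),
    # then a per-channel gate in (0, 1) rescales each channel.
    pooled = feat.mean(axis=(1, 2))
    weights = sigmoid(pooled)
    return feat * weights[:, None, None]

def spatial_attention(feat):
    # Pool across channels to an (H, W) map, then gate every location.
    pooled = feat.mean(axis=0)
    weights = sigmoid(pooled)
    return feat * weights[None, :, :]

def hybrid_attention(feat):
    # Channel gating followed by spatial gating (the order is an assumption).
    return spatial_attention(channel_attention(feat))

feat = np.random.rand(8, 16, 16).astype(np.float32)  # toy (C, H, W) feature map
out = hybrid_attention(feat)
```

In a real network these gates would be produced by small learned sub-networks (e.g. MLPs or convolutions) rather than raw pooled values, and the re-parameterization in the RRAM would additionally fold multi-branch convolutions into a single kernel at inference time.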