3D Video Applications Research Articles

The depth-image-based rendering (DIBR) algorithms used in 3D video applications introduce new types of artifacts, mostly located around disoccluded regions. Because DIBR algorithms involve geometric transformations, most of them introduce non-uniform geometric distortions that affect edge coherency in the synthesized images. Such distortions are not handled efficiently by common image quality assessment metrics, which are primarily designed for other types of distortions. To better deal with the specific geometric distortions in DIBR-synthesized images, we propose a full-reference metric based on multi-scale image decomposition with morphological filters. Because the decomposition uses non-linear morphological filters, important geometric information such as edges is maintained across resolution levels. Edge distortion between the multi-scale representation subbands of the reference image and the DIBR-synthesized image is measured using mean squared error (MSE). In this way, areas around edges that are prone to synthesis artifacts are emphasized in the metric score. Two versions of the morphological multi-scale metric have been explored: (a) the Morphological Pyramid Peak Signal-to-Noise Ratio metric (MP-PSNR), based on morphological pyramid decomposition, and (b) the Morphological Wavelet Peak Signal-to-Noise Ratio metric (MW-PSNR), based on morphological wavelet decomposition. The performance of the proposed metrics has been tested on two databases that contain DIBR-synthesized images: the IRCCyN/IVC DIBR image database and the MCL-3D stereoscopic image database. The proposed metrics achieve significantly higher correlation with human judgment than state-of-the-art image quality metrics and than the tested metric dedicated to synthesis-related artifacts.
The proposed metrics are computationally efficient, since the morphological operators involve only integer arithmetic and simple operations such as min, max, and sum, together with a simple MSE calculation. MP-PSNR performs slightly better than MW-PSNR. It agrees very well with human judgment (Pearson 0.894, Spearman 0.77) when tested on the MCL-3D stereoscopic image database. We have demonstrated that PSNR agrees particularly well with human judgment when it is calculated between images at higher scales of the morphological multi-scale representations. Consequently, simplified, reduced versions of the multi-scale metrics are proposed that take into account only the detail images at higher decomposition scales. The reduced version of MP-PSNR agrees very well with human judgment (Pearson 0.904, Spearman 0.863) on the IRCCyN/IVC DIBR image database.
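As a rough illustration of the multi-scale idea described in this abstract, the following Python sketch builds a small erosion/dilation pyramid and pools the per-band MSE into a PSNR-style score. The 2x2 square structuring element, the pixel-replication upsampling, the equal-weight averaging of band MSEs, and all function names (`erode`, `dilate`, `pyramid`, `mp_psnr_sketch`) are assumptions made for illustration only; the published MP-PSNR may differ in each of these choices.

```python
import math

def erode(img, k=2):
    """Grayscale erosion: min over a k x k window anchored at each pixel (clipped at borders)."""
    h, w = len(img), len(img[0])
    return [[min(img[yy][xx]
                 for yy in range(y, min(y + k, h))
                 for xx in range(x, min(x + k, w)))
             for x in range(w)] for y in range(h)]

def dilate(img, k=2):
    """Grayscale dilation: max over the mirrored k x k window (clipped at borders)."""
    h, w = len(img), len(img[0])
    return [[max(img[yy][xx]
                 for yy in range(max(y - k + 1, 0), y + 1)
                 for xx in range(max(x - k + 1, 0), x + 1))
             for x in range(w)] for y in range(h)]

def reduce_level(img):
    """Erode, then keep every second row and column."""
    return [row[::2] for row in erode(img)[::2]]

def expand_level(img, h, w):
    """Upsample to h x w by pixel replication (an assumption), then dilate."""
    up = [[img[y // 2][x // 2] for x in range(w)] for y in range(h)]
    return dilate(up)

def pyramid(img, levels):
    """Return `levels` detail images followed by the coarsest approximation."""
    bands, cur = [], img
    for _ in range(levels):
        h, w = len(cur), len(cur[0])
        nxt = reduce_level(cur)
        pred = expand_level(nxt, h, w)
        bands.append([[cur[y][x] - pred[y][x] for x in range(w)] for y in range(h)])
        cur = nxt
    bands.append(cur)
    return bands

def mse(a, b):
    """Mean squared error between two equally sized images."""
    n = len(a) * len(a[0])
    return sum((pa - pb) ** 2 for ra, rb in zip(a, b) for pa, pb in zip(ra, rb)) / n

def mp_psnr_sketch(ref, syn, levels=2, peak=255):
    """PSNR computed from the equal-weight mean of per-band MSEs (pooling rule is assumed)."""
    band_mse = [mse(r, s) for r, s in zip(pyramid(ref, levels), pyramid(syn, levels))]
    pooled = sum(band_mse) / len(band_mse)
    return math.inf if pooled == 0 else 10 * math.log10(peak ** 2 / pooled)

# Usage: a vertical edge shifted by one pixel in the "synthesized" image.
ref = [[0] * 4 + [200] * 4 for _ in range(8)]
syn = [[0] * 5 + [200] * 3 for _ in range(8)]
print(mp_psnr_sketch(ref, syn))
```

Only min, max, subtraction, and a squared-error sum are used, which mirrors the abstract's point that the operators reduce to simple integer computations.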


With the growing demand for 3D video, efforts are underway to incorporate it into the next generation of broadcast and streaming applications and standards. Scalability is one possible way to reduce the amount of data in multi-view/3D video in heterogeneous environments, but using Scalable Multi-view Video Coding (SMVC) for multi-view/3D video still poses many unresolved challenges. In this thesis, we propose an adaptive framework for using SMVC effectively in various 3D video applications. First, the proper scalable modality should be selected according to the application at hand and its related features and requirements. To the best of our knowledge, no work so far has systematically defined new and proper scalable modalities specifically for multi-view/3D video. Hence, as the first step of the proposed framework, we suggest a methodology to extract the proper scalable modalities for multi-view/3D video. In addition, while SMVC can help support heterogeneous receivers, the question becomes how to scale the 3D video content, for a given type of scalability and a specified application, in order to achieve the highest performance and satisfy the receivers' constraints as much as possible. In other words, the proper mechanism for assigning SMVC data to the various layers should be clearly determined. This issue is addressed in the second step of our proposed framework. The method uses the concepts of inter-layer and intra-layer disparity. Note that the specific features of a given scalable modality should be used to define these concepts in that modality. Simulation results indicate that the proposed method achieves a relatively better compression rate for each layer, with much less overhead. As the next step of our proposed framework, we propose an analytical view-level rate model for multi-view video coding.
Our rate model takes into account both previous theoretical results and new results obtained specifically for multi-view video and confirmed by comprehensive practical experiments. Simulation results show that our model can predict the rate of each view with relatively high precision, with an average estimation error of 12% for the tested sequences. In addition, evaluating the overall visual quality of scalable multi-view video requires a new objective perceptual quality measure designed specifically for scalable multi-view/3D video. Although several subjective and objective quality assessment methods have been proposed for multi-view/3D sequences, no comparable attempt has been made so far for quality assessment of scalable multi-view/3D video. Hence, in this framework, we propose a new methodology for building suitable objective quality assessment metrics for different scalable modalities in multi-view/3D video. Our proposed methodology considers the importance of each layer and its content as a quality-of-experience factor in the overall quality. Furthermore, in addition to the quality of each layer, the concept of inter-layer and intra-layer disparity is used as an effective feature for evaluating the overall perceived quality more accurately. Our simulation results indicate that the correlation coefficient between our extracted objective quality metric and subjective quality assessment is 0.8 on average for the tested video sequences. As the last step of our proposed framework, we present a novel method for rate-distortion optimization in scalable multi-view video that tries to minimize the perceptual distortion of the decoded video under the condition that the sum of bits generated from the different views stays within a given bit budget.
Since the constraint-based optimization problem is usually computationally intensive, our proposed approach uses the concept of intra-layer and inter-layer disparity to reduce this computational complexity. Experimental results show that the proposed approach uses on average 24% and 42% less bitrate than the H.264/AVC rate-distortion optimization for the base layer and the base plus enhancement layers, respectively. Although the thesis is in Farsi (Persian), the following English papers capture most of its essence:

H. Roodaki, M.R. Hashemi, and S. Shirmohammadi, "A New Methodology to Derive Objective Quality Assessment Metrics for Scalable Multi-view 3D Video Coding", ACM Transactions on Multimedia Computing, Communications, and Applications, Vol. 8, No. 3S, Article 44, September 2012, 25 pages. DOI: 10.1145/2348816.2348823

H. Roodaki, M.R. Hashemi, and S. Shirmohammadi, "Rate-Distortion Optimization for Scalable Multi-View Video Coding", Proc. IEEE International Conference on Multimedia and Expo, Chengdu, China, July 14-18 2014, 6 pages. DOI: 10.1109/ICME.2014.6890275

H. Roodaki, Z. Iravani, M.R. Hashemi, S. Shirmohammadi, and M. Gabbouj, "A New Rate Distortion Model for Multi-View/3D Video Coding", Proc. IEEE International Workshop on Hot Topics in 3D, in Proc. IEEE International Conference on Multimedia and Expo, July 15-19 2013, San Jose, USA, 6 pages. DOI: 10.1109/ICMEW.2013.6618338

H. Roodaki, M.R. Hashemi, and S. Shirmohammadi, "New Scalable Modalities in Multi-view 3D Video", Proc. ACM Workshop on Mobile Video, Oslo, Norway, February 27 2013, pp. 25-30. DOI: 10.1145/2457413.2457420
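The bit-budget constraint described in the thesis abstract above (minimize total distortion while the sum of bits over all views stays within a budget) can be illustrated with a minimal brute-force sketch. This is not the thesis's method: the exhaustive search, the function name `allocate`, and the rate/distortion numbers are all hypothetical, chosen only to show the shape of the constrained problem and why its cost grows quickly with the number of views.

```python
from itertools import product

def allocate(options_per_view, bit_budget):
    """Pick one (rate, distortion) point per view that minimizes total
    distortion subject to total rate <= bit_budget.

    Returns (choice, total_rate, total_distortion), or None if nothing fits.
    Exhaustive search: the number of combinations is the product of the
    option counts, so it grows exponentially with the number of views.
    """
    best = None
    for combo in product(*options_per_view):
        rate = sum(r for r, _ in combo)
        dist = sum(d for _, d in combo)
        if rate <= bit_budget and (best is None or dist < best[2]):
            best = (combo, rate, dist)
    return best

# Hypothetical operating points per view: (rate in kbit/s, distortion in arbitrary units).
views = [
    [(100, 9.0), (200, 5.0), (400, 2.0)],
    [(100, 8.0), (200, 4.0), (400, 3.0)],
]

print(allocate(views, 500))
```

With two views and three options each there are only nine combinations, but the same search over many views and layers quickly becomes expensive, which is the kind of cost the abstract says its disparity-based approach is meant to reduce.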


Articles published on 3D Video Applications

Depth Perception Assessment of 3D Videos Based on Stereoscopic and Spatial Orientation Structural Features

As-Deformable-As-Possible Single-Image-Based View Synthesis Without Depth Prior

Multi-layer and Multi-scale feature aggregation for DIBR-Synthesized image quality assessment

Quality Assessment of View Synthesis Based on Visual Saliency and Texture Naturalness

Quality assessment of DIBR-synthesized views: An overview

Perceptual quality assessment of 3D videos with stereoscopic degradations

Efficient Shape Coding for Object-Based 3D Video Applications

Network-Assisted Neural Adaptive Naked-Eye 3D Video Streaming Over Wireless Networks

Combining Local and Global Measures for DIBR-Synthesized Image Quality Evaluation

Reliable 3D video streaming considering region of interest

High-resolution depth map generator for 3D video applications using time-of-flight cameras

DIBR-synthesized image quality assessment based on morphological multi-scale approach

Depth Intra Coding for 3D Video Based on Geometric Primitives

HEVC-based 3D holoscopic video coding using self-similarity compensated prediction

Multi-Scale Synthesized View Assessment Based on Morphological Pyramids

Multi-directional Hole Filling Method for Virtual View Synthesis

Depth-Based Texture Coding in AVC-Compatible 3D Video Coding

Hoda Roodaki

Learning-based saliency model with depth information

Fractal Depth Map Sequence Coding Algorithm with Motion-vector-field-based Motion Estimation
