Abstract

We present a general framework that handles the different processing stages of the three-dimensional (3D) scene representation referred to as “view-plus-depth” (V+Z). The main component of the framework is the relation between the depth map and the super-pixel segmentation of the color image. We propose a hierarchical super-pixel segmentation that keeps the same boundaries across hierarchical segmentation layers. Such segmentation allows for corresponding depth segmentation, decimation, and reconstruction at varying quality, and is instrumental in tasks such as depth compression and 3D data fusion. For the latter we utilize a cross-modality reconstruction filter that adapts to the size of the refining super-pixel segments. We also propose a novel depth encoding scheme, which includes a dedicated arithmetic encoder and handles misalignment outliers. We demonstrate that our scheme is especially applicable to low bit-rate depth encoding and to the fusion of color and depth data where the latter is noisy and of lower spatial resolution.
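
To make the boundary-congruence idea concrete, here is a minimal sketch (not the authors' algorithm) that builds a fine SLIC super-pixel layer and merges adjacent segments through a region-adjacency graph; because each coarse segment is a union of fine segments, the coarse boundaries are a subset of the fine ones by construction. The scikit-image calls and the merge threshold are illustrative assumptions.

```python
# Minimal sketch of a two-layer boundary-congruent super-pixel hierarchy,
# using standard SLIC + RAG merging as a stand-in for the paper's clustering.
from skimage import data, graph, segmentation

image = data.astronaut()  # stand-in color image

# Fine layer: plain SLIC super-pixels.
fine = segmentation.slic(image, n_segments=400, compactness=10, start_label=0)

# Coarse layer: merge adjacent fine segments with similar mean color.
rag = graph.rag_mean_color(image, fine)
coarse = graph.cut_threshold(fine, rag, thresh=30)  # threshold is illustrative

# Every boundary of `coarse` coincides with a boundary of `fine`, so depth
# segmented per layer stays consistent across the hierarchy.
```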

Highlights

  • Representation and processing of real-world three-dimensional (3D) visual scenes has been of increasing interest recently, in light of new forms of immersive visualization enabled by advances in 3D display technology

  • Depth maps are combined with confocal captures of 2D color images to form a 3D representation referred to as “view-plus-depth” (V+Z) [1], [2], where both images have the same size and are pixel-to-pixel aligned, augmenting each color pixel with its position in space (a back-projection sketch appears after this list)

  • Where geometry is represented by disparity maps, we use the percentage of bad pixels (BAD), i.e. the percentage of disparities that differ from the ground-truth disparity map by more than one pixel [8] (sketched right after this list)
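
The BAD metric from the last highlight reduces to a one-line comparison; a minimal sketch, assuming the disparity maps are given as NumPy arrays:

```python
import numpy as np

def bad_pixels(disparity, ground_truth, threshold=1.0):
    """Percentage of disparities that deviate from ground truth by more
    than `threshold` pixels (the BAD metric; threshold = 1 by default)."""
    err = np.abs(disparity.astype(float) - ground_truth.astype(float))
    return 100.0 * np.mean(err > threshold)
```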

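The second highlight describes V+Z as augmenting each color pixel with its position in space. A minimal sketch of that lifting step, assuming a pinhole camera; the intrinsics `fx, fy, cx, cy` are hypothetical parameters, not values from the paper:

```python
import numpy as np

def backproject(depth, fx, fy, cx, cy):
    """Lift a per-pixel depth map Z(u, v) to an (h, w, 3) map of XYZ points."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return np.stack([x, y, depth], axis=-1)
```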

Summary

INTRODUCTION

Representation and processing of real-world three-dimensional (3D) visual scenes has been of increasing interest recently, in light of new forms of immersive visualization enabled by advances in 3D display technology. In ToF approaches, depth data is limited by the low sensor resolution, e.g. 120 × 160 [9]. The resolution is constrained by the requirement that the photo-elements work in high-sensitivity conditions, which is ensured by increasing the sensing-element area. We consider the case where the depth comes as a low-resolution, noise-degraded map and the task is to restore it to full resolution; such a case is instrumental in non-confocal ToF/color data fusion systems.

3D FUSION OF ASYMMETRIC VIEW-PLUS-DEPTH DATA

The 3D data fusion problem has been considered in different research settings, aiming at aligning the edges of the two modalities while enforcing piecewise smoothness of the depth. We consider an asymmetric V+Z capturing setup, where the depth maps are obtained by a noisy ToF sensor.
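
As a point of reference for this fusion task, the sketch below implements classic joint bilateral upsampling, which propagates low-resolution depth to the color grid while letting high-resolution color edges gate the weights. It is a generic baseline, not the paper's super-pixel-adaptive reconstruction filter; the function name, `scale` convention, and sigma values are illustrative assumptions.

```python
import numpy as np

def joint_bilateral_upsample(depth_lo, color_hi, scale, sigma_s=2.0, sigma_r=0.1):
    """Upsample `depth_lo` to the resolution of `color_hi` (float RGB in [0, 1]),
    guided by color similarity. `scale` = hi-res size / lo-res size."""
    h, w = color_hi.shape[:2]
    out = np.zeros((h, w))
    radius = int(3 * sigma_s)
    for y in range(h):
        for x in range(w):
            yl, xl = y / scale, x / scale            # position on the lo-res grid
            num = den = 0.0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    ys, xs = round(yl) + dy, round(xl) + dx
                    if not (0 <= ys < depth_lo.shape[0] and 0 <= xs < depth_lo.shape[1]):
                        continue
                    # Spatial weight, measured on the lo-res grid.
                    w_s = np.exp(-((ys - yl) ** 2 + (xs - xl) ** 2) / (2 * sigma_s ** 2))
                    # Range weight: color similarity between the target pixel
                    # and the hi-res pixel corresponding to the lo-res sample.
                    yc, xc = min(int(ys * scale), h - 1), min(int(xs * scale), w - 1)
                    diff = color_hi[y, x] - color_hi[yc, xc]
                    w_r = np.exp(-float(np.dot(diff, diff)) / (2 * sigma_r ** 2))
                    num += w_s * w_r * depth_lo[ys, xs]
                    den += w_s * w_r
            out[y, x] = num / max(den, 1e-12)
    return out
```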

RELATION WITH PREVIOUS WORK
SUPER-PIXEL CLUSTERING
MULTI-LAYER CONGRUENT SUPER-PIXEL CLUSTERING
DEPTH RECONSTRUCTION
EXPERIMENTAL RESULTS
CONCLUSIONS