Abstract

In this paper, we propose TSFE-Net, a two-stream feature extraction network for active stereo matching. First, we apply extra local contrast normalization (LCN) to the dataset because speckle intensity depends on distance. Second, we construct two-stream feature extraction layers, consisting of convolutional and deconvolutional layers at different scales, to simultaneously learn features from the original images and the LCN images and to aggregate context information into the left and right feature maps. Third, we convert the captured depth maps into disparity maps using the camera parameters to build a supervised learning model. TSFE-Net not only handles the illumination dependence between speckle intensity and distance but also preserves the details of the original images. Our dataset is captured with a RealSense D435 camera. We conduct extensive quantitative and qualitative evaluations on a series of scenes and achieve an end-point error (EPE) of 0.335 on a TITAN Xp platform, computed over valid pixels only. The assessment results show that our network is capable of real-time depth reconstruction for active patterns.
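The LCN step removes the dependence of speckle intensity on distance before the images enter the network. As a rough illustration only, the sketch below applies local contrast normalization to a single-channel IR image; the 9x9 window and epsilon are illustrative assumptions, not values taken from the paper.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def local_contrast_normalization(img, window=9, eps=1e-6):
    """Subtract the local mean and divide by the local standard deviation.

    img: single-channel IR speckle image as a float array.
    window, eps: illustrative choices; the paper's exact settings may differ.
    """
    img = img.astype(np.float32)
    local_mean = uniform_filter(img, size=window)
    local_sq_mean = uniform_filter(img * img, size=window)
    local_std = np.sqrt(np.maximum(local_sq_mean - local_mean ** 2, 0.0))
    return (img - local_mean) / (local_std + eps)
```

Both the original image and its LCN counterpart would then be fed to the two feature extraction streams.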

Highlights

  • Depth reconstruction technology is popular in computer vision and is essential to virtual reality, augmented reality [1], and autonomous driving. Depth reconstruction systems are divided into passive stereo systems and active stereo systems

  • There are several applications based on active stereo systems, such as Time of Flight (TOF) [2], Structured Light (SL) [3], and binocular stereo matching

  • In this work, we present the two-stream feature extraction network (TSFE-Net), a depth estimation method based on deep learning for active stereo systems


Summary

INTRODUCTION

Depth reconstruction technology is popular in computer vision and is essential to virtual reality, augmented reality [1], and autonomous driving. Conventional stereo matching pipelines apply a winner-take-all (WTA) strategy to choose the optimal disparity. Active binocular reconstruction can accurately compute the depth of textureless areas by exploiting both the projected speckle pattern and the scene content, and ActiveStereoNet can reconstruct the depth of textureless scenes using active illumination. We propose a two-stream feature extraction network for active stereo matching (TSFE-Net) based on an end-to-end deep learning approach. It extends recent work on the self-supervised active stereo network [22] and a supervised passive stereo network [23] to achieve active binocular reconstruction. We convert the depth maps obtained from a RealSense D435 camera into disparity maps using the camera parameters to construct an end-to-end supervised learning model, as sketched below.
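The depth-to-disparity conversion follows the standard rectified pinhole relation d = f·B / Z. The sketch below is a minimal illustration under that assumption; focal_px, baseline_m, and max_disp are placeholders that would be read from the D435 calibration rather than values stated in the paper.

```python
import numpy as np

def depth_to_disparity(depth_m, focal_px, baseline_m, max_disp=192.0):
    """Convert a metric depth map into a disparity map via d = f * B / Z.

    depth_m: depth in meters; zeros mark invalid pixels.
    focal_px: focal length of the rectified IR camera, in pixels.
    baseline_m: stereo baseline in meters (from the device extrinsics).
    """
    disparity = np.zeros_like(depth_m, dtype=np.float32)
    valid = depth_m > 0
    disparity[valid] = focal_px * baseline_m / depth_m[valid]
    return np.clip(disparity, 0.0, max_disp), valid
```

The resulting disparity maps, restricted to valid pixels, serve as the supervision targets, consistent with the valid-pixel EPE reported in the abstract.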

RELATED WORK
FEATURE EXTRACTION
Figure: right feature image of size D × W × H
EXPERIMENTS
ABLATION EXPERIMENTS
Findings
CONCLUSION