Sparse Depth Research Articles

Depth completion is a crucial task in autonomous driving, aiming to convert a sparse depth map into a dense depth prediction. The sparse depth map serves as a partial reference for the actual depth, and the fusion of RGB images is frequently employed to augment the completion process owing to its inherent richness in semantic information. Image-guided depth completion confronts three principal challenges: (1) the effective fusion of the two modalities; (2) the enhancement of depth information recovery; and (3) the realization of real-time predictive capabilities requisite for practical autonomous driving scenarios. In response to these challenges, we propose a concise but high-performing network, named CHNet, to achieve high-performance depth completion with an elegant and straightforward architecture. Firstly, we use a fast guidance module to fuse the two sensor features, harnessing abundant auxiliary information derived from the color space. Unlike the prevalent complex guidance modules, our approach adopts an intuitive and cost-effective strategy. In addition, we find and analyze the optimization inconsistency problem for observed and unobserved positions. To mitigate this challenge, we introduce a decoupled depth prediction head, tailored to better discern and predict depth values for both valid and invalid positions, incurring minimal additional inference time. Capitalizing on the dual-encoder and single-decoder architecture, the simplicity of CHNet facilitates an optimal balance between accuracy and computational efficiency. In benchmark evaluations on the KITTI depth completion dataset, CHNet demonstrates competitive performance metrics and inference speeds relative to contemporary state-of-the-art methodologies. To assess the generalizability of our approach, we extend our evaluations to the indoor NYUv2 dataset, where CHNet continues to yield impressive outcomes. The code of this work will be available at https://github.com/lmomoy/CHNet.
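
As a rough illustration of the decoupled prediction idea in the abstract above, the following NumPy sketch merges two predicted depth maps using the validity mask of the sparse input. The abstract does not say how CHNet combines its outputs, so the function name `merge_decoupled_predictions`, the convention that zero marks a missing measurement, and the mask-based selection are all assumptions made for illustration, not the authors' implementation.

```python
import numpy as np


def merge_decoupled_predictions(sparse_depth, pred_observed, pred_unobserved):
    """Pick one of two depth predictions per pixel (hypothetical helper).

    Assumption: a value of 0 in `sparse_depth` means "no measurement".
    Pixels with a measurement take `pred_observed`; all others take
    `pred_unobserved`.
    """
    valid = sparse_depth > 0
    return np.where(valid, pred_observed, pred_unobserved)


# Toy 2x2 example: two of the four pixels carry a sparse measurement.
sparse = np.array([[0.0, 2.0],
                   [3.0, 0.0]])
pred_obs = np.full((2, 2), 10.0)    # stand-in output for observed positions
pred_unobs = np.full((2, 2), 20.0)  # stand-in output for unobserved positions

dense = merge_decoupled_predictions(sparse, pred_obs, pred_unobs)
```

Here `dense` takes its values from `pred_obs` where `sparse` is non-zero and from `pred_unobs` elsewhere. In a real network the two inputs would come from separate prediction branches rather than constant arrays.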

Depth estimation is a critical problem in robotics applications, especially in autonomous driving. Currently, depth prediction based on binocular stereo matching and depth completion based on the fusion of a monocular image and a laser point cloud are the two mainstream methods. However, the former usually suffers from a lack of constraints when building the cost volume, while the latter cannot be trained in a self-supervised way and does not exploit the geometric constraints of stereo matching, which we believe would further improve performance. Therefore, we propose a novel multimodal neural network, namely UAMD-Net, for dense depth completion based on the fusion of binocular stereo matching and the weak constraint from sparse point clouds. Specifically, the sparse point clouds are converted to a sparse depth map and fed into the multimodal feature encoder (MFE) together with the binocular images, constructing a cross-modal cost volume. This volume is then further processed by the multimodal feature aggregator (MFA) and the depth regression layer. Furthermore, since previous multimodal depth estimation methods ignore the problem of modality dependence, we propose a new training strategy called random modality dropout (RMD), which enables the network to be trained adaptively with multiple modality inputs and to perform inference with specific modality inputs. Benefiting from the flexible network structure and adaptive training method, our proposed network can realize unified training under various modality input conditions. Comprehensive experiments conducted on the KITTI and DrivingStereo depth completion datasets demonstrate that our method produces robust results and outperforms other state-of-the-art methods.
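
The random modality dropout strategy mentioned above can be pictured with a small NumPy sketch. The abstract does not state which inputs are dropped or with what probability, so everything here is an assumption for illustration: the function name `random_modality_dropout`, the choice to always keep the left image, the zero-filling of dropped inputs, and the default probabilities.

```python
import numpy as np


def random_modality_dropout(left, right, sparse_depth, rng,
                            p_drop_depth=0.3, p_drop_right=0.3):
    """Randomly blank optional modalities for one training sample.

    Assumptions (not taken from the paper): the left image is always kept,
    and a dropped modality is replaced by an all-zero array of the same shape.
    """
    if rng.random() < p_drop_depth:
        sparse_depth = np.zeros_like(sparse_depth)
    if rng.random() < p_drop_right:
        right = np.zeros_like(right)
    return left, right, sparse_depth


rng = np.random.default_rng(0)
left_img = np.ones((4, 4, 3))
right_img = np.ones((4, 4, 3))
depth = np.ones((4, 4))

# Probability 1.0 always drops both optional inputs; 0.0 never drops them.
_, right_dropped, depth_dropped = random_modality_dropout(
    left_img, right_img, depth, rng, p_drop_depth=1.0, p_drop_right=1.0)
_, right_kept, depth_kept = random_modality_dropout(
    left_img, right_img, depth, rng, p_drop_depth=0.0, p_drop_right=0.0)
```

Applying such a function per sample during training would expose a model to stereo-only, monocular-plus-depth, and full-input conditions, which is one plausible reading of "unified training under various modality input conditions".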

Articles published on Sparse Depth

NDDepth: Normal-Distance Assisted Monocular Depth Estimation and Completion.

AGSPN: Efficient attention-gated spatial propagation network for depth completion

Real-time, dense UAV mapping by leveraging monocular depth prediction with monocular-inertial SLAM

A Transformer-Based Image-Guided Depth-Completion Model with Dual-Attention Fusion Module.

CAFNet: Context aligned fusion for depth completion

LightDepth: A resource efficient depth estimation approach for dealing with ground truth sparsity via curriculum learning

Coplane-constrained sparse depth sampling and local depth propagation for depth estimation

Monocular Depth Estimation Based on Dilated Convolutions and Feature Fusion

NSVDNet: Normalized Spatial-Variant Diffusion Network for Robust Image-Guided Depth Completion

A concise but high-performing network for image guided depth completion in autonomous driving

Deep panoramic depth prediction and completion for indoor scenes

Structure-Aware Cross-Modal Transformer for Depth Completion.

UAMD-Net: A Unified Adaptive Multimodal Neural Network for Dense Depth Completion

Distance Transform Pooling Neural Network for LiDAR Depth Completion.

Radar-Camera Fusion Network for Depth Estimation in Structured Driving Scenes.

DiffSVR: Differentiable Neural Implicit Surface Rendering for Single-View Reconstruction with Highly Sparse Depth Prior

A 256 × 256 LiDAR Imaging System Based on a 200 mW SPAD-Based SoC with Microlens Array and Lightweight RGB-Guided Depth Completion Neural Network.

DesNet: Decomposed Scale-Consistent Network for Unsupervised Depth Completion

Efficient Edge-Preserving Multi-View Stereo Network for Depth Estimation

Non-local affinity adaptive acceleration propagation network for generating dense depth maps from LiDAR.
