Abstract

Accurate depth cues are crucial for 3D perception tasks, yet monocular depth estimation networks are no longer sufficient for realistic scenarios. Currently, the most effective approach is to introduce depth information from other sensing modalities into the image domain. Radar has become a popular sensor to fuse with cameras owing to its low cost and all-weather operation. This paper explores how to more effectively integrate the heterogeneous data of radar point clouds and RGB images to improve depth estimation. Since most previous works have not fully exploited the potential of integrating these two modalities, we propose RCDformer, a novel transformer-based network that fuses radar and camera data for dense depth estimation. Without shrinking the receptive field, our approach fully models the contextual relationships between the two sensors, reducing the impact of radar noise on overall performance. In the proposed Radar-guided Multi-scale Depth Fusion (RGDF) module, the spatial prior extracted by the Radar Feature Extractor (RFE) is embedded into the multi-scale hierarchical features output by the Image Feature Extractor (IFE) via a modified deformable cross-attention, guiding the depth prediction for images. Furthermore, we find that incorporating the Radar Cross Section (RCS) attribute as an extended channel of the radar map is beneficial for dense depth estimation and improves the overall performance of our model. We evaluate the proposed method on the nuScenes dataset; experimental results show that it outperforms state-of-the-art models on most metrics.
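
To make the fusion scheme concrete, below is a minimal, hypothetical PyTorch sketch of the two ideas the abstract describes: radar tokens guiding one level of the multi-scale image features through cross-attention, and RCS as an extra channel of the radar map. Note the hedges: the paper's RGDF module uses a modified deformable cross-attention, for which this sketch substitutes standard multi-head cross-attention; all class names, function names, and shapes are illustrative assumptions, not the authors' implementation.

    # Illustrative sketch only; the actual RGDF module uses a modified
    # deformable cross-attention rather than the standard attention below.
    import torch
    import torch.nn as nn

    class RadarGuidedFusionSketch(nn.Module):
        """Fuse radar tokens into one scale of the image feature pyramid."""
        def __init__(self, dim=256, num_heads=8):
            super().__init__()
            # Image features act as queries; radar features as keys/values.
            self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
            self.norm = nn.LayerNorm(dim)

        def forward(self, img_feat, radar_feat):
            # img_feat:   (B, H*W, C) one level of the IFE output, flattened
            # radar_feat: (B, N, C)   tokens produced by the RFE
            fused, _ = self.attn(query=img_feat, key=radar_feat, value=radar_feat)
            return self.norm(img_feat + fused)  # residual fusion

    def build_radar_map(uv, depth, rcs, h, w):
        """Rasterize projected radar points into a 2-channel map:
        channel 0 = metric depth, channel 1 = RCS (the extended channel
        the abstract reports as beneficial). Names/shapes are assumed."""
        radar_map = torch.zeros(2, h, w)
        u = uv[:, 0].long().clamp(0, w - 1)
        v = uv[:, 1].long().clamp(0, h - 1)
        radar_map[0, v, u] = depth
        radar_map[1, v, u] = rcs
        return radar_map

One design point this sketch preserves: because the sparse radar tokens enter only as keys and values, noisy radar returns influence the image features only to the extent the learned attention weights allow, which is how cross-attention-style fusion can suppress radar noise without cropping the receptive field.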

