Abstract

Effective environment perception is a prerequisite for the successful application of autonomous driving, and the detection of traffic objects in particular underpins downstream tasks such as driving decisions and motion execution. However, recent studies show that a single sensor cannot perceive the surrounding environment stably and effectively in complex conditions. In this article, we propose a multi-scale feature fusion framework that uses a dual backbone network to extract camera and radar feature maps and fuses them at three different feature scales with a new fusion module. In addition, we introduce a new generation mechanism for radar projection images and relabel the nuScenes dataset, since no other suitable autonomous driving dataset is available for model training and testing. Experimental results show that the fusion models achieve higher accuracy than visual-image-based models under the PASCAL Visual Object Classes (VOC) and Common Objects in Context (COCO) evaluation criteria, about 2% over the baseline model (YOLOX).
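The sketch below illustrates the dual-backbone, three-scale fusion idea described in the abstract; it is not the authors' code. The module names, channel widths, and the simple concatenate-then-convolve fusion operator are assumptions made for illustration, and the actual fusion module in the paper may differ.

```python
import torch
import torch.nn as nn


class FusionBlock(nn.Module):
    """Fuse camera and radar feature maps that share the same spatial scale."""

    def __init__(self, cam_ch: int, rad_ch: int, out_ch: int):
        super().__init__()
        self.fuse = nn.Sequential(
            nn.Conv2d(cam_ch + rad_ch, out_ch, kernel_size=1),
            nn.BatchNorm2d(out_ch),
            nn.SiLU(inplace=True),
        )

    def forward(self, cam_feat: torch.Tensor, rad_feat: torch.Tensor) -> torch.Tensor:
        # Channel-wise concatenation followed by a 1x1 conv: one plausible
        # fusion operator, chosen here only for illustration.
        return self.fuse(torch.cat([cam_feat, rad_feat], dim=1))


class DualBackboneFusion(nn.Module):
    """Run separate backbones on camera and radar inputs, fuse at three scales."""

    def __init__(self, cam_backbone: nn.Module, rad_backbone: nn.Module,
                 cam_chs=(256, 512, 1024), rad_chs=(64, 128, 256)):
        super().__init__()
        self.cam_backbone = cam_backbone      # e.g. the CSPDarknet used by YOLOX
        self.rad_backbone = rad_backbone      # a lighter backbone on radar projection images
        self.fusions = nn.ModuleList(
            FusionBlock(c, r, c) for c, r in zip(cam_chs, rad_chs)
        )

    def forward(self, image: torch.Tensor, radar_img: torch.Tensor):
        cam_feats = self.cam_backbone(image)      # assumed to return three feature maps
        rad_feats = self.rad_backbone(radar_img)  # assumed to match those spatial scales
        # The fused maps would then feed a detection neck and head (e.g. YOLOX's).
        return [f(c, r) for f, c, r in zip(self.fusions, cam_feats, rad_feats)]
```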
