Abstract As a key step in obstacle avoidance and path planning, camera-based obstacle detection is crucial for autonomous driving. Real road traffic environments are complex and variable, and existing obstacle detection algorithms still suffer from insufficient sensing ability. This work therefore proposes a camera-sensor-based Strong Sensing DEtection TRansformer (SS-DETR) obstacle detection model for autonomous driving. First, a receptive-field attention ResNet is designed to improve feature extraction by accounting for the importance of both receptive-field spatial features and channels. Second, an intra-scale feature interaction module based on multiple information fusion attention is introduced to strengthen the representation of high-level feature maps. Third, the cross-scale feature-fusion module is optimized to extract more detailed information from multi-scale feature maps. Finally, a localization loss function based on L1 and Powerful Intersection over Union v2 is adopted to further improve detection performance. The proposed model is evaluated on the KITTI dataset, which contains camera-captured images of road obstacles. The experimental results show that, compared to Real-Time DETR, SS-DETR improves mean average precision (mAP)@50:95 and mAP@50 by 2.4% and 1.9%, respectively, while running at a real-time inference speed of 33.7 frames per second. To further assess the generalization ability of the approach, experiments are also conducted on the camera-based Cityscapes dataset. The results show that the proposed method effectively raises obstacle detection accuracy and offers a fresh perspective on obstacle identification.
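
To make the idea of a localization loss that combines an L1 term with an IoU-based term concrete, the following is a minimal Python sketch. It is not the authors' implementation: it substitutes plain IoU for Powerful IoU v2, whose formulation is not given in the abstract, and the function names and the weights `w_l1` and `w_iou` are hypothetical.

```python
def iou(box_a, box_b):
    """Intersection over Union of two boxes given as (x1, y1, x2, y2)."""
    # Corners of the intersection rectangle.
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    # Clamp to zero when the boxes do not overlap.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def localization_loss(pred, target, w_l1=1.0, w_iou=1.0):
    """Weighted sum of an L1 term and an IoU-based term.

    Assumption: plain (1 - IoU) stands in for the Powerful IoU v2 term
    named in the abstract; the weights are illustrative only.
    """
    l1 = sum(abs(p - t) for p, t in zip(pred, target))
    return w_l1 * l1 + w_iou * (1.0 - iou(pred, target))
```

The L1 term penalizes coordinate error directly, while the IoU term penalizes poor overlap regardless of box scale, which is why detection transformers commonly pair the two.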