Improved YOLOv3 model with feature map cropping for multi-scale road object detection

Lingzhi Shen, Yuanzhi Ni, Yue Wang, Vladimir Stojanovic, Hongfeng Tao

doi:10.1088/1361-6501/acb075

https://doi.org/10.1088/1361-6501/acb075

Abstract

Road object detection is an essential step in intelligent driving. Road objects such as vehicles and pedestrians vary widely in scale and have an uncertain spatial distribution, which places high demands on the detection algorithm. This paper therefore proposes a method based on YOLOv3 (You Only Look Once v3) that aims to improve cross-scale detection and to focus on the most valuable regions. The proposed method addresses the need for multi-scale detection, and its individual components are expected to be useful in road object detection. First, a K-means-GIoU algorithm is designed to generate a priori boxes whose shapes are close to the ground-truth boxes, which reduces the complexity of training and speeds up convergence. Then, a detection branch is added for small targets, and a feature map cropping module is introduced into this new branch to remove areas that are likely to contain only background or easy-to-detect targets; the cropped areas of the feature map are filled with zeros. Further, a channel attention module and a spatial attention module are added to strengthen the network’s attention to major regions. Experimental results on the KITTI dataset show that the proposed method maintains a fast detection speed, increases the mAP (mean average precision) by as much as 2.86 compared with YOLOv3-ultralytics, and in particular improves detection performance for small-scale objects.
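
The anchor-generation step can be illustrated with a minimal sketch. The abstract does not give the details of K-means-GIoU, so the code below rests on assumptions: boxes are reduced to (width, height) pairs that share a top-left corner, the clustering distance is taken to be 1 - GIoU, and centroids are updated by the cluster mean. The names giou_wh and kmeans_giou are hypothetical and are not taken from the paper.

```python
import random


def giou_wh(box, centroid):
    """GIoU of two (width, height) boxes aligned at the same top-left corner."""
    w1, h1 = box
    w2, h2 = centroid
    inter = min(w1, w2) * min(h1, h2)
    union = w1 * h1 + w2 * h2 - inter
    enclose = max(w1, w2) * max(h1, h2)  # smallest box covering both
    return inter / union - (enclose - union) / enclose


def kmeans_giou(boxes, k, iters=100, seed=0):
    """Cluster (w, h) boxes into k anchors using the assumed distance 1 - GIoU."""
    rng = random.Random(seed)
    centroids = rng.sample(boxes, k)  # pick k distinct boxes as initial anchors
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for box in boxes:
            # Assign each box to the centroid with the smallest 1 - GIoU.
            best = min(range(k), key=lambda j: 1.0 - giou_wh(box, centroids[j]))
            clusters[best].append(box)
        new_centroids = []
        for j, members in enumerate(clusters):
            if members:
                mean_w = sum(w for w, _ in members) / len(members)
                mean_h = sum(h for _, h in members) / len(members)
                new_centroids.append((mean_w, mean_h))
            else:
                new_centroids.append(centroids[j])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # assignments have stabilised
        centroids = new_centroids
    # Return anchors ordered from smallest to largest area.
    return sorted(centroids, key=lambda c: c[0] * c[1])


if __name__ == "__main__":
    # Toy (w, h) labels in pixels; a real run would use the dataset's ground-truth boxes.
    toy_boxes = [(10, 10), (11, 9), (9, 11), (100, 100), (98, 102), (102, 98)]
    print(kmeans_giou(toy_boxes, k=2))
```

In practice the input would be the widths and heights of all ground-truth boxes in the training set, and k would match the number of a priori boxes the detector uses across its detection branches.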
