Target detection optimization model based on fine-grained feature fusion

Xianfu Bao, Guangyao Bai, Rui Yang, Zanxia Qiang, Fengjie Cen, Feng Wu

doi:10.1117/12.2611693

Abstract

This paper optimizes the YOLOv4 detection model for the unmanned vehicle detection scenario. A hybrid attention mechanism that combines channel and spatial attention is introduced into the backbone network of the detection model. The high-level semantic feature map and the shallow fine-grained feature map are merged in the detection neck network, and the hybrid attention mechanism strengthens the screening of the fine-grained features. The experiments lead to two results: (1) For the CSPDarknet-53 network, the hybrid attention mechanism performs better than the channel attention mechanism. (2) On the VOC 2007+2012 dataset, the model in this paper, which combines shallow-layer feature maps with the hybrid attention mechanism, improves detection accuracy by about 4% compared with the original YOLOv4 object detection model. In addition, images from the KITTI dataset, which are close to actual driving scenes, are used to verify the model in a real driving scenario. The improved model achieves excellent detection results on the KITTI dataset.
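The hybrid attention idea described in the abstract can be illustrated with a minimal NumPy sketch. The abstract does not give the exact design, so everything below is an assumption for illustration: channel attention is applied first and spatial attention second (a CBAM-style arrangement), and the function names, the reduction ratio of 4, the 7x7 kernel, and the random weights are hypothetical rather than taken from the paper.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def channel_attention(x, w1, w2):
    """Reweight channels of x with shape (C, H, W).

    w1 (C//r, C) and w2 (C, C//r) form a small shared two-layer MLP
    (hypothetical weights, assumed reduction ratio r).
    """
    avg = x.mean(axis=(1, 2))  # global average pooling -> (C,)
    mx = x.max(axis=(1, 2))    # global max pooling -> (C,)

    def mlp(v):
        return w2 @ np.maximum(w1 @ v, 0.0)  # linear, ReLU, linear

    weights = sigmoid(mlp(avg) + mlp(mx))    # one weight per channel
    return x * weights[:, None, None]


def spatial_attention(x, kernel):
    """Reweight spatial positions of x with shape (C, H, W).

    kernel has shape (2, k, k) and is applied to the channel-wise
    average and max maps (assumed design, zero padding).
    """
    pooled = np.stack([x.mean(axis=0), x.max(axis=0)])  # (2, H, W)
    k = kernel.shape[-1]
    pad = k // 2
    padded = np.pad(pooled, ((0, 0), (pad, pad), (pad, pad)))
    h, w = x.shape[1:]
    scores = np.empty((h, w))
    for i in range(h):
        for j in range(w):
            scores[i, j] = np.sum(padded[:, i:i + k, j:j + k] * kernel)
    return x * sigmoid(scores)[None, :, :]


def hybrid_attention(x, w1, w2, kernel):
    """Channel attention followed by spatial attention (assumed order)."""
    return spatial_attention(channel_attention(x, w1, w2), kernel)


# Toy feature map and random weights, for illustration only.
rng = np.random.default_rng(0)
C, H, W = 8, 16, 16
x = rng.standard_normal((C, H, W))
w1 = rng.standard_normal((C // 4, C)) * 0.1
w2 = rng.standard_normal((C, C // 4)) * 0.1
kernel = rng.standard_normal((2, 7, 7)) * 0.1
y = hybrid_attention(x, w1, w2, kernel)
```

The output keeps the input shape, and because both attention maps lie in (0, 1), each activation is only scaled down, never amplified. In a real detector the weights would be learned jointly with the backbone rather than sampled at random.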

Full Text