Abstract

To address the problem that object detection models such as Faster R-CNN, YOLO and SSD focus too heavily on network depth and neglect to make full use of the deep semantic feature information of the image, this paper proposes a new network: AM-YOLO. The network exploits the contextual relationship between shallow and deep layers to achieve multi-feature fusion of the target. In AM-YOLO, SE blocks are first added to the backbone network to weight the channel importance of feature maps. Then a new path aggregation network is proposed to achieve full fusion of shallow and deep features. Using YOLOv4 as the baseline and PASCAL VOC 07+12 as the dataset, the experimental results show that on a 3060 GPU the mAP of AM-YOLO improves by 2.86% over the YOLOv4 model, which validates the comprehensive performance of AM-YOLO.
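The channel reweighting that the abstract attributes to the SE blocks can be sketched as follows. This is a minimal NumPy illustration of the standard squeeze-and-excitation operation (global average pooling, a two-layer bottleneck, sigmoid gating), not the authors' actual implementation; the function name, the reduction ratio, and the randomly initialized weights `w1`/`w2` are all hypothetical.

```python
import numpy as np

def squeeze_excite(feature_map, w1, b1, w2, b2):
    """Recalibrate a (C, H, W) feature map channel-wise.

    w1/b1 and w2/b2 are hypothetical weights for the two
    fully connected layers of the excitation bottleneck.
    """
    # Squeeze: global average pooling collapses each channel to a scalar
    z = feature_map.mean(axis=(1, 2))            # shape (C,)
    # Excitation: FC -> ReLU -> FC -> sigmoid gives per-channel weights in (0, 1)
    h = np.maximum(0.0, w1 @ z + b1)             # shape (C // r,)
    s = 1.0 / (1.0 + np.exp(-(w2 @ h + b2)))     # shape (C,)
    # Scale: reweight each channel by its learned importance
    return feature_map * s[:, None, None]

# Toy example: 4 channels, reduction ratio r = 2
rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8, 8))
w1 = rng.standard_normal((2, 4)); b1 = np.zeros(2)
w2 = rng.standard_normal((4, 2)); b2 = np.zeros(4)
y = squeeze_excite(x, w1, b1, w2, b2)
print(y.shape)  # (4, 8, 8): same shape, channels rescaled
```

In a real backbone the two weight matrices are learned end-to-end, so channels carrying informative features receive gates near 1 while less useful channels are suppressed toward 0.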
