Complex overlapping pedestrian target detection network based on the YOLOv3 model

Yuchi Zhang

doi:10.62051/rpbbxx55

Abstract

This paper proposes a detection model for complex, overlapping pedestrian targets that builds on YOLOv3 with multi-scale feature fusion and a context-aware mechanism. The data set was shot on campus with a SONY A7R3a camera and then edited and collated. It contains 358 high-definition videos at a resolution of 1920×1080 and a frame rate of 50 Hz, about 179,000 frames in total. In testing, the proposed model is slightly more accurate than the Single Shot MultiBox Detector (SSD), matches the accuracy of Faster R-CNN, and is slightly less accurate than RetinaNet. However, YOLOv3 detects more than twice as fast as SSD, RetinaNet, and Faster R-CNN: with an input size of 320×320, it processes a single image in only 22 ms. The simplified YOLOv3-tiny is faster still.
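
As a rough sanity check on the figures quoted in the abstract, the following Python sketch derives the average clip length and the throughput implied by the 22 ms latency. The function and constant names are illustrative and do not come from the paper. The sketch assumes the 22 ms figure covers the whole per-image pipeline and that frames are processed one at a time.

```python
# Figures quoted in the abstract (names are illustrative, not from the paper).
NUM_VIDEOS = 358
TOTAL_FRAMES = 179_000   # "about 179,000 frames"
FRAME_RATE_HZ = 50       # capture frame rate
LATENCY_MS = 22          # per-image processing time at 320x320 input


def frames_per_video(total_frames: int, num_videos: int) -> float:
    """Average number of frames in one video clip."""
    return total_frames / num_videos


def seconds_per_video(total_frames: int, num_videos: int, frame_rate_hz: float) -> float:
    """Average clip duration in seconds."""
    return frames_per_video(total_frames, num_videos) / frame_rate_hz


def throughput_fps(latency_ms: float) -> float:
    """Images per second, assuming one image is processed at a time."""
    return 1000.0 / latency_ms


def dataset_processing_minutes(total_frames: int, latency_ms: float) -> float:
    """Time to run detection over every frame once, in minutes."""
    return total_frames * latency_ms / 1000.0 / 60.0


if __name__ == "__main__":
    print(f"frames per video:   {frames_per_video(TOTAL_FRAMES, NUM_VIDEOS):.0f}")
    print(f"seconds per video:  {seconds_per_video(TOTAL_FRAMES, NUM_VIDEOS, FRAME_RATE_HZ):.1f}")
    print(f"throughput (fps):   {throughput_fps(LATENCY_MS):.1f}")
    print(f"full pass (minutes): {dataset_processing_minutes(TOTAL_FRAMES, LATENCY_MS):.1f}")
```

Under these assumptions, the numbers work out to about 500 frames (10 s) per clip and roughly 45 images per second, slightly below the 50 Hz capture rate.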

Full Text