To improve the detection accuracy of vehicles and pedestrians in traffic scenes, this paper modifies, compresses, and deploys YOLOv7-tiny, a typical single-stage object detection algorithm. Model improvement: first, to address missed detections of small objects, information from a shallower feature layer is incorporated into the original feature fusion branch, forming a four-scale detection head; second, a Multi-Stage Feature Fusion (MSFF) module is proposed that integrates shallow, middle, and deep feature information to extract more comprehensive small-object information. Model compression: the Layer-Adaptive Magnitude-based Pruning (LAMP) algorithm is combined with the Torch-Pruning library, and the improved model is pruned at several different pruning rates. Model deployment: the V7-tiny-P2-MSFF model, pruned by 45% using LAMP, is deployed on the NVIDIA Jetson AGX Xavier embedded platform. Experimental results show that the improved and pruned model achieves a 12.3% increase in mAP@0.5 over the original model, while the parameter count, computational cost, and model size are reduced by 76.74%, 7.57%, and 70.94%, respectively. Moreover, the pruned and quantized model deployed on Xavier takes 9.5 ms to run inference on a single image.
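As a rough illustration of the LAMP scoring idea mentioned above, the following is a minimal NumPy sketch of the unstructured form of the score: within each layer, a weight's squared magnitude is divided by the sum of squared magnitudes of all weights in that layer that are at least as large, and the lowest-scoring weights across all layers are pruned first. This is not the authors' pipeline, which applies LAMP through the Torch-Pruning library for structured pruning; the function names `lamp_scores` and `lamp_masks` and the toy layers are hypothetical.

```python
import numpy as np


def lamp_scores(weights):
    """Return a LAMP score for every weight in one layer (unstructured form)."""
    flat = np.asarray(weights, dtype=np.float64).ravel()
    sq = flat ** 2
    # Sort squared magnitudes in ascending order within the layer.
    order = np.argsort(sq, kind="stable")
    sorted_sq = sq[order]
    # Suffix sums: for position u, the sum of sorted_sq[v] over all v >= u.
    suffix = np.cumsum(sorted_sq[::-1])[::-1]
    # Score = own squared magnitude / suffix sum (0 where the suffix sum is 0).
    scores_sorted = np.divide(
        sorted_sq, suffix, out=np.zeros_like(sorted_sq), where=suffix > 0
    )
    # Scatter the scores back to the original weight positions.
    scores = np.empty_like(scores_sorted)
    scores[order] = scores_sorted
    return scores.reshape(np.shape(weights))


def lamp_masks(layers, prune_ratio):
    """Build keep-masks by pruning the globally lowest-scoring weights.

    Assumption: ties at the threshold are pruned together, so slightly
    more than `prune_ratio` of the weights may be removed.
    """
    scores = [lamp_scores(w) for w in layers]
    all_scores = np.concatenate([s.ravel() for s in scores])
    k = int(prune_ratio * all_scores.size)
    if k == 0:
        return [np.ones(s.shape, dtype=bool) for s in scores]
    # The k-th smallest score acts as the global pruning threshold.
    threshold = np.partition(all_scores, k - 1)[k - 1]
    return [s > threshold for s in scores]


if __name__ == "__main__":
    # Hypothetical toy layers, only to show the call pattern.
    toy_layers = [np.array([3.0, 1.0, 2.0]), np.array([0.1, 0.2])]
    for mask in lamp_masks(toy_layers, prune_ratio=0.4):
        print(mask)
```

Because the largest weight in each layer always scores 1, a global threshold below 1 leaves at least one weight in every layer. This property is what lets a single global threshold yield layer-specific sparsity levels.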