To tackle the challenges of weak sensing capacity for multi-scale objects, high missed detection rates for occluded targets, and difficulties for model deployment in detection tasks of intelligent roadside perception systems, the PDT-YOLO algorithm based on YOLOv7-tiny is proposed. Firstly, we introduce the intra-scale feature interaction module (AIFI) and reconstruct the feature pyramid structure to enhance the detection accuracy of multi-scale targets. Secondly, a lightweight convolution module (GSConv) is introduced to construct a multi-scale efficient layer aggregation network module (ETG), enhancing the network feature extraction ability while maintaining weight. Thirdly, multi-attention mechanisms are integrated to optimize the feature expression ability of occluded targets in complex scenarios, Finally, Wise-IoU with a dynamic non-monotonic focusing mechanism improves the accuracy and generalization ability of model sensing. Compared with YOLOv7-tiny, PDT-YOLO on the DAIR-V2X-C dataset improves mAP50 and mAP50:95 by 4.6% and 12.8%, with a parameter count of 6.1 million; on the IVODC dataset by 15.7% and 11.1%. We deployed the PDT-YOLO in an actual traffic environment based on a robot operating system (ROS), with a detection frame rate of 90 FPS, which can meet the needs of roadside object detection and edge deployment in complex traffic scenes.
Read full abstract