Abstract

To tackle the challenges of weak sensing capacity for multi-scale objects, high missed detection rates for occluded targets, and difficulties for model deployment in detection tasks of intelligent roadside perception systems, the PDT-YOLO algorithm based on YOLOv7-tiny is proposed. Firstly, we introduce the intra-scale feature interaction module (AIFI) and reconstruct the feature pyramid structure to enhance the detection accuracy of multi-scale targets. Secondly, a lightweight convolution module (GSConv) is introduced to construct a multi-scale efficient layer aggregation network module (ETG), enhancing the network feature extraction ability while maintaining weight. Thirdly, multi-attention mechanisms are integrated to optimize the feature expression ability of occluded targets in complex scenarios, Finally, Wise-IoU with a dynamic non-monotonic focusing mechanism improves the accuracy and generalization ability of model sensing. Compared with YOLOv7-tiny, PDT-YOLO on the DAIR-V2X-C dataset improves mAP50 and mAP50:95 by 4.6% and 12.8%, with a parameter count of 6.1 million; on the IVODC dataset by 15.7% and 11.1%. We deployed the PDT-YOLO in an actual traffic environment based on a robot operating system (ROS), with a detection frame rate of 90 FPS, which can meet the needs of roadside object detection and edge deployment in complex traffic scenes.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call