Abstract

The YOLOX-Swin algorithm uses the Swin Transformer as the backbone network of YOLOX to extract traffic sign image features, capturing sufficient global context through shifted-window, multi-head self-attention. It then relies on YOLOX's own path-aggregation feature pyramid network to extract and fuse multi-scale feature information, including the low-level features of traffic signs, to improve the detection accuracy of small traffic sign targets. Because small traffic signs occupy few pixels in an image, and because the Transformer requires more training samples than a convolutional network, the original copy-and-paste augmentation is improved to increase the number of traffic sign samples and further raise detection accuracy. Test results on the TT100K dataset show that the proposed method achieves higher detection accuracy than several other methods and meets the accuracy and real-time requirements of object detection.
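The abstract does not describe the exact modification to the copy-and-paste augmentation, so the following is only a minimal Python sketch of the general idea it refers to: patches of existing sign instances are duplicated into non-overlapping locations of the same image to raise the number of small-sign samples. The function names and parameters (`copy_paste_small_signs`, `max_paste`) are illustrative assumptions, not the authors' implementation.

```python
import random
import numpy as np


def _iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    if inter == 0:
        return 0.0
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / float(area_a + area_b - inter)


def copy_paste_small_signs(image: np.ndarray, boxes, labels, max_paste=3):
    """Sketch of copy-paste augmentation for small traffic signs.

    image  : HxWx3 uint8 array
    boxes  : list of (x1, y1, x2, y2) sign boxes in pixel coordinates
    labels : class label per box
    Returns the augmented image plus the extended box/label lists.
    """
    h, w = image.shape[:2]
    out = image.copy()
    new_boxes, new_labels = list(boxes), list(labels)

    for _ in range(max_paste):
        if not boxes:
            break
        # Pick an existing sign instance and crop its patch.
        idx = random.randrange(len(boxes))
        x1, y1, x2, y2 = boxes[idx]
        patch = image[y1:y2, x1:x2]
        ph, pw = patch.shape[:2]
        if ph == 0 or pw == 0 or ph >= h or pw >= w:
            continue

        # Try a few random locations; keep one that does not overlap any box.
        for _ in range(10):
            nx = random.randint(0, w - pw - 1)
            ny = random.randint(0, h - ph - 1)
            cand = (nx, ny, nx + pw, ny + ph)
            if all(_iou(cand, b) == 0.0 for b in new_boxes):
                out[ny:ny + ph, nx:nx + pw] = patch
                new_boxes.append(cand)
                new_labels.append(labels[idx])
                break

    return out, new_boxes, new_labels
```

In practice a variant like this would be applied on the fly during training, before mosaic or resize transforms, so that each epoch sees additional small-sign instances without changing the underlying annotation files.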
