Abstract

Pedestrian-related accidents occur more frequently at night, when visible-light (VI) cameras perform poorly. Thermal cameras work better than VI cameras in this environment. However, thermal images have several drawbacks, including high noise, low resolution, limited detail, and susceptibility to ambient temperature. To overcome these shortcomings, an improved algorithm based on You Only Look Once version 3 (YOLOv3) is proposed. First, the number and sizes of the anchors are obtained using k-means++, which makes the anchor shapes better suited to the targets. Second, an attention module is added to the backbone network, which helps extract better feature maps from low-quality thermal images. Finally, an improved atrous spatial pyramid pooling (ASPP) module is appended to the backbone network so that the extracted feature maps contain more multi-scale and contextual information. Experiments on the Computer Vision Center-09 (CVC-09) dataset show an average precision of 86.1%, which is 3.5% higher than YOLOv3 and 0.8% higher than YOLOv4, with a detection speed of 48 FPS. The results show that the improved algorithm achieves good accuracy and generalization.
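The anchor step described above can be illustrated with a minimal sketch. The abstract does not give the authors' exact procedure, so the following assumes the common YOLO practice of clustering ground-truth box widths and heights under a 1 − IoU distance, with k-means++ seeding; the function names (`iou_wh`, `kmeans_pp_anchors`) and the convergence details are illustrative, not taken from the paper. Choosing the number of anchors k is typically done separately, e.g. by comparing the mean best IoU over a range of k values.

```python
import numpy as np

def iou_wh(boxes, clusters):
    """IoU between N (w, h) boxes and K cluster (w, h) boxes,
    treating all boxes as aligned at the same top-left corner."""
    w = np.minimum(boxes[:, None, 0], clusters[None, :, 0])
    h = np.minimum(boxes[:, None, 1], clusters[None, :, 1])
    inter = w * h
    union = (boxes[:, None, 0] * boxes[:, None, 1]
             + clusters[None, :, 0] * clusters[None, :, 1] - inter)
    return inter / union

def kmeans_pp_anchors(boxes, k, iters=100, seed=0):
    """Cluster (w, h) boxes into k anchors using k-means++ seeding
    and a 1 - IoU distance (illustrative sketch, not the paper's code)."""
    rng = np.random.default_rng(seed)
    # k-means++ seeding: first centre uniformly at random; later
    # centres sampled with probability proportional to their
    # (1 - IoU) distance to the nearest existing centre.
    centers = boxes[[rng.integers(len(boxes))]]
    while len(centers) < k:
        d = 1.0 - iou_wh(boxes, centers).max(axis=1)
        centers = np.vstack(
            [centers, boxes[rng.choice(len(boxes), p=d / d.sum())]])
    # Lloyd iterations: assign each box to the highest-IoU centre,
    # then move each centre to the mean of its assigned boxes.
    for _ in range(iters):
        assign = iou_wh(boxes, centers).argmax(axis=1)
        new = np.array([boxes[assign == j].mean(axis=0)
                        if np.any(assign == j) else centers[j]
                        for j in range(k)])
        if np.allclose(new, centers):
            break
        centers = new
    # Return anchors sorted by area, smallest first.
    return centers[np.argsort(centers[:, 0] * centers[:, 1])]
```

The resulting (w, h) pairs would then replace YOLOv3's default anchors, so that the priors match the typical aspect ratio of pedestrians in the training set.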
