Abstract

Pedestrian detection is a computer vision task that is crucial to traffic safety. However, accurately detecting pedestrians in complex environments remains challenging, largely because of occlusion. To address this problem, this paper presents an end-to-end pedestrian detection model based on the DEtection TRansformer (DETR) architecture and designed to handle occlusion. The proposed model comprises a backbone Convolutional Neural Network (CNN) and a Transformer network. The backbone CNN incorporates variable convolution and U-Net design principles to strengthen feature extraction, particularly for occluded pedestrians. In addition, our Adaptive Occlusion-Aware Attention Mechanism (AOAM) is embedded within the Transformer network, allowing the model to adjust attention weights dynamically and thereby improve the localization and identification of occluded pedestrians. Extensive experiments on the Caltech and ETH datasets show that our model outperforms state-of-the-art approaches on four key evaluation metrics. This study provides effective methods and a theoretical foundation for pedestrian detection in complex environments.
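
To make the idea of dynamically adjusted attention weights concrete, the sketch below shows one possible way an occlusion cue could reweight standard scaled dot-product attention. It is an illustration only and not the AOAM itself, whose formulation is not given in this abstract. The function name `occlusion_aware_attention`, the per-key `visibility` scores in (0, 1], and the choice of adding `log(visibility)` to the attention logits are all assumptions made for this example.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def occlusion_aware_attention(queries, keys, values, visibility, eps=1e-6):
    """Scaled dot-product attention with a hypothetical visibility bias.

    queries:    list of query vectors, each of length d
    keys:       list of key vectors, each of length d
    values:     list of value vectors, one per key
    visibility: assumed per-key score in (0, 1]; lower means more occluded

    Adding log(visibility) to each logit multiplies the unnormalized
    attention weight of that key by its visibility score.
    """
    d = len(keys[0])
    outputs, all_weights = [], []
    for q in queries:
        logits = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
            + math.log(max(v, eps))  # assumed occlusion bias, not from the paper
            for k, v in zip(keys, visibility)
        ]
        weights = softmax(logits)
        # Weighted sum of value vectors, one output channel at a time.
        out = [
            sum(w * val[c] for w, val in zip(weights, values))
            for c in range(len(values[0]))
        ]
        outputs.append(out)
        all_weights.append(weights)
    return outputs, all_weights


if __name__ == "__main__":
    # Two identical keys; the second is marked as mostly occluded.
    out, w = occlusion_aware_attention(
        queries=[[1.0, 0.0]],
        keys=[[1.0, 0.0], [1.0, 0.0]],
        values=[[1.0], [0.0]],
        visibility=[1.0, 0.25],
    )
    print(w[0], out[0])
```

With identical keys, the attention weights in this sketch are proportional to the visibility scores, so the heavily occluded key contributes less to the output. A learned mechanism would predict such scores from image features rather than take them as input.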

