Abstract

Although many deep-learning-based methods have achieved considerable detection performance for pedestrians with high visibility, their overall performances are still far from satisfactory, especially when heavily occluded instances are included. In this research, we have developed a novel pedestrian detector using a deformable attention-guided network (DAGN). Considering that pedestrians may be deformed with occlusions or under diverse poses, we have designed a deformable convolution with an attention module (DCAM) to sample from non-rigid locations, and obtained the attention feature map by aggregating global context information. Furthermore, the loss function was optimized to get accurate detection bounding boxes, by adopting complete-IoU loss for regression, and the distance IoU-NMS was used to refine the predicted boxes. Finally, a preprocessing technique based on tone mapping was applied to cope with the low visibility cases due to poor illumination. Extensive evaluations were conducted on three popular traffic datasets. Our method could decrease the log-average miss rate (MR−2) by 12.44% and 7.8%, respectively, for the heavy occlusion and overall cases, when compared to the published state-of-the-art results of the Caltech pedestrian dataset. Of the CityPersons and EuroCity Persons datasets, our proposed method outperformed the current best results by about 5% in MR−2 for the heavy occlusion cases.

Highlights

  • Pedestrian detection is an essential computer vision problem that is widely utilized in many real-world applications, such as autonomous driving systems, robotics, and security monitoring systems

  • The distance information union (IoU)-based (DIoU) norm loss of the (NMS) was adopted to refine that the proposed method leads to notable improvement in performance for the deprediction boxes to improve the detection performance of occluded instances

  • We explain the details of the experimental setup and evaluation metrics

Read more

Summary

Introduction

Pedestrian detection is an essential computer vision problem that is widely utilized in many real-world applications, such as autonomous driving systems, robotics, and security monitoring systems. Inspired by deep-learning-based techniques of generic object detection, many research works [1,2,3,4,5,6,7] have achieved high detection accuracy for reasonable scale and non-occluded pedestrians. The detection performance is unsatisfactory for the difficult cases, such as crowd scenes, rare pose instances, and poor visibility cases influenced by time of the day or weather. Pedestrians are likely to be occluded by others or by roadside obstructions.

Results
Discussion
Conclusion

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.