Abstract

There are many small objects in traffic scenes, but their low resolution and the limited information they carry make their detection a continuing challenge. Small object detection is nevertheless essential for understanding traffic scene environments. To improve the detection accuracy of small objects in traffic scenes, we propose a small object detection method based on attention feature fusion. First, a multi-scale channel attention block (MS-CAB) is designed, which aggregates the effective information of feature maps at both local and global scales. Based on this block, an attention feature fusion block (AFFB) is proposed, which better integrates contextual information from different layers. Finally, the AFFB replaces the linear fusion module in the object detection network to obtain the final network structure. Experimental results show that, compared to the benchmark model YOLOv5s, our method achieves a higher mean Average Precision (mAP) while maintaining real-time performance: on the validation set of the traffic scene dataset BDD100K, it increases the mAP over all objects by 0.9 percentage points and, at the same time, increases the mAP of small objects by 3.5%.

Highlights

  • In traffic scenes, the visual perception technology of intelligent vehicles can help automatic driving systems to perceive complex environments accurately and in time, which is a requirement for avoiding collisions and for safe driving

  • The high accuracy and fast real-time performance of object detection algorithms are very important for the safety and real-time control of autonomous vehicles

  • We present a small object detection method for traffic scenes based on attention feature fusion for autonomous driving systems, as an improvement to the YOLOv5s architecture


Summary

Introduction

The visual perception technology of intelligent vehicles helps automatic driving systems perceive complex environments accurately and in time, which is a requirement for avoiding collisions and for safe driving. Prevailing deep learning-based object detection algorithms, such as YOLOv5 [7], treat every region of the feature map identically by default; that is, each region contributes equally to the final detection result. This means they do not weight the convolutional features extracted by the network according to their position and importance. In response to these problems, we first propose an MS-CAB to alleviate the difficulties that scale changes cause for small object detection. This block reduces feature inconsistency between objects at different scales and, at the same time, focuses attention on the regions that contain objects, suppressing unnecessary shallow background features. The paper ends with our conclusions and suggestions for future work.
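To make the idea concrete, the fusion described above can be sketched in a few lines of plain Python. This is a toy illustration under our own assumptions, not the paper's implementation: the function names `ms_cab` and `affb` are hypothetical, the "global scale" is reduced to a per-channel mean, the "local scale" to the value at each position, and attention-weighted blending `w*x + (1-w)*y` stands in for the network's learned fusion of two feature maps.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def ms_cab(x):
    """Toy multi-scale channel attention: for each channel, combine a global
    descriptor (the mean over all spatial positions) with the local value at
    each position, then squash to (0, 1) with a sigmoid.
    `x` is a [channels][height][width] nested list."""
    weights = []
    for ch in x:
        flat = [v for row in ch for v in row]
        g = sum(flat) / len(flat)  # global scale: one descriptor per channel
        # local + global, mapped to an attention weight per position
        weights.append([[sigmoid(v + g) for v in row] for row in ch])
    return weights

def affb(x, y):
    """Toy attention feature fusion: compute attention weights from the
    element-wise sum of two same-shaped feature maps, then blend them as
    w*x + (1-w)*y instead of a plain linear sum."""
    s = [[[xv + yv for xv, yv in zip(xr, yr)] for xr, yr in zip(xc, yc)]
         for xc, yc in zip(x, y)]
    w = ms_cab(s)
    return [[[wv * xv + (1 - wv) * yv for wv, xv, yv in zip(wr, xr, yr)]
             for wr, xr, yr in zip(wc, xc, yc)]
            for wc, xc, yc in zip(w, x, y)]

# One channel on a 2x2 spatial grid: a "deep" map x and a "shallow" map y.
x = [[[1.0, 2.0], [3.0, 4.0]]]
y = [[[0.0, 0.0], [0.0, 0.0]]]
fused = affb(x, y)
```

The point of the sketch is the contrast with linear fusion: where a plain addition or concatenation gives every position the same say, the attention weights let positions with stronger responses (likely objects) dominate the fused map, which is what motivates replacing the linear fusion module with the AFFB.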

Object Detection
Attention Mechanism
Feature Fusion
The YOLOv5s Benchmark Model
Multi-Scale Channel Attention Block
Attention
Datasets and Experimental Settings
Quantitative Result Analysis
Comparative Analysis of Detection Results
Findings
Conclusions and Future Work