Abstract

At present, deep-learning-based detection of instances in traffic scenes follows two mainstream directions: object detection and semantic segmentation. Object detection localizes individual objects in the road scene, while semantic segmentation performs pixel-level classification of objects and background categories. However, when pedestrians and other objects occlude one another, semantic segmentation struggles to separate individual instances, and the anchor boxes produced by object detection contain redundant information. To address this problem, this paper proposes a method that combines object detection and semantic segmentation. The method first applies the YOLOv5 model to detect people, vehicles, and other objects in the captured traffic scene image. In parallel, an improved DeepLabv3+ network captures the semantic and regional information of roads in the same image. Finally, the predictions from the two task branches are drawn onto the image under test, and the combined results are output together. The method effectively distinguishes people, vehicles, roads, and other elements of the traffic scene, allowing the two branches to complement each other in understanding the autonomous-driving road scene and improving detection accuracy. Experimental results show that the method achieves a mean average precision (mAP) of 79.11% with high segmentation accuracy, making it suitable for autonomous driving on real urban roads.
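To make the two-branch pipeline concrete, the sketch below wires together a public YOLOv5 detector and torchvision's DeepLabV3 model, the latter standing in for the paper's improved DeepLabv3+ (whose modifications are not reproduced here). The file names, pretrained weights, class handling, and overlay scheme are illustrative assumptions, not the authors' exact implementation.

```python
# Minimal sketch of the two-branch pipeline: detection boxes from YOLOv5
# are drawn on top of a semantic-segmentation overlay, then the fused
# result is written out as a single image.
import cv2
import numpy as np
import torch
import torchvision

# Detection branch: public YOLOv5 model for people, vehicles, etc.
detector = torch.hub.load('ultralytics/yolov5', 'yolov5s')

# Segmentation branch: torchvision DeepLabV3 as a stand-in for the
# paper's improved DeepLabv3+. NOTE: the default weights use VOC-style
# classes (no dedicated "road" class); the paper's model is trained for
# road segmentation, so this branch is only a placeholder.
segmenter = torchvision.models.segmentation.deeplabv3_resnet50(
    weights='DEFAULT').eval()
preprocess = torchvision.transforms.Compose([
    torchvision.transforms.ToTensor(),
    torchvision.transforms.Normalize(mean=[0.485, 0.456, 0.406],
                                     std=[0.229, 0.224, 0.225]),
])

image = cv2.imread('traffic_scene.jpg')        # hypothetical input, BGR
rgb = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)

# Branch 1: object detection -> bounding boxes (x1, y1, x2, y2, conf, cls).
detections = detector(rgb)
boxes = detections.xyxy[0].cpu().numpy()

# Branch 2: semantic segmentation -> per-pixel class map, same H x W as input.
with torch.no_grad():
    logits = segmenter(preprocess(rgb).unsqueeze(0))['out'][0]
mask = logits.argmax(0).cpu().numpy()

# Fuse the branches: tint non-background pixels, then draw boxes on top.
overlay = image.copy()
tint = np.array([0, 255, 0], dtype=np.float32)
overlay[mask > 0] = (0.5 * overlay[mask > 0] + 0.5 * tint).astype(np.uint8)
for x1, y1, x2, y2, conf, cls in boxes:
    cv2.rectangle(overlay, (int(x1), int(y1)), (int(x2), int(y2)),
                  (0, 0, 255), 2)
    cv2.putText(overlay, detector.names[int(cls)], (int(x1), int(y1) - 4),
                cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 0, 255), 1)

cv2.imwrite('combined_output.jpg', overlay)
```

Running both branches on the same frame and merging at the drawing stage mirrors the paper's design: occluded pedestrians that the segmentation map cannot separate are still distinguished by individual detection boxes, while the segmentation overlay supplies the road-region context that boxes alone lack.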
