Abstract

Street scene images contain objects at many different scales, and a segmentation model that relies on single-scale feature extraction and fusion cannot produce good segmentation and prediction results. Therefore, a semantic segmentation model based on multi-scale feature fusion and an attention mechanism is proposed. First, an asymmetric atrous spatial pyramid pooling (ASPP) structure is used to extract features at different levels and scales from the street scene image. Second, an attention mechanism is applied to the feature maps at each scale so that the network can focus on the salient features of every level. Finally, all feature maps are resized to the same dimensions and fused, so that the key feature information of objects at every scale in the street scene is fully extracted and segmented effectively. Experimental results on the Cityscapes dataset show that the proposed multi-scale, attention-based semantic segmentation network further improves segmentation accuracy and optimizes the segmentation results.
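The sketch below illustrates, in PyTorch, the kind of pipeline the abstract describes: ASPP-style multi-scale feature extraction, an attention weighting applied to each branch, and fusion of all branches at a common spatial size. The dilation rates, channel counts, and the squeeze-and-excitation style channel attention are illustrative assumptions, and the authors' specific asymmetric ASPP variant is not reproduced here.

```python
# Minimal sketch: ASPP branches + per-branch channel attention + fusion.
# Hyperparameters (rates, channels, reduction ratio) are assumptions, not
# the configuration from the paper.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ChannelAttention(nn.Module):
    """Squeeze-and-excitation style channel attention (assumed variant)."""
    def __init__(self, channels, reduction=4):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x):
        # Global average pool -> per-channel weights -> rescale features.
        w = self.fc(x.mean(dim=(2, 3)))
        return x * w[:, :, None, None]


class ASPPAttentionHead(nn.Module):
    """Atrous branches at several rates, each gated by attention, then fused."""
    def __init__(self, in_ch, out_ch, rates=(1, 6, 12, 18)):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Sequential(
                nn.Conv2d(in_ch, out_ch, 3, padding=r, dilation=r, bias=False),
                nn.BatchNorm2d(out_ch),
                nn.ReLU(inplace=True),
                ChannelAttention(out_ch),
            )
            for r in rates
        ])
        # Image-level pooling branch, as in standard ASPP.
        self.pool = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(in_ch, out_ch, 1, bias=False),
            nn.ReLU(inplace=True),
        )
        self.project = nn.Conv2d(out_ch * (len(rates) + 1), out_ch, 1)

    def forward(self, x):
        h, w = x.shape[2:]
        feats = [b(x) for b in self.branches]
        # Resize the pooled branch to the shared spatial size before fusion.
        feats.append(F.interpolate(self.pool(x), size=(h, w),
                                   mode="bilinear", align_corners=False))
        return self.project(torch.cat(feats, dim=1))


if __name__ == "__main__":
    head = ASPPAttentionHead(in_ch=256, out_ch=128)
    out = head(torch.randn(1, 256, 64, 128))
    print(out.shape)  # torch.Size([1, 128, 64, 128])
```

In this reading of the abstract, attention is applied per branch before concatenation so that each scale's salient responses are emphasized prior to the 1x1 fusion convolution; the actual placement of the attention module in the paper may differ.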
