Real-Time Semantic Segmentation Network Based on Lite Reduced Atrous Spatial Pyramid Pooling Module Group

Yangsheng Tian, Shuiping Zhang, Haihui Wang, Fangyuan Chen

doi:10.1109/crc51253.2020.9253492

Abstract

Objective: With the spread of mobile devices and robots, semantic segmentation networks increasingly need to be efficient. To speed up inference, most existing real-time semantic segmentation methods reduce the size of the image before segmentation, which sacrifices image detail and spatial information. To achieve a good balance between speed and accuracy, we propose an efficient real-time semantic segmentation network based on lite reduced atrous spatial pyramid pooling. Method: The proposed network uses an encoder-decoder structure to build an end-to-end semantic segmentation network. In the encoder, the image is downsampled quickly to avoid heavy computation on high-resolution feature maps. In the decoder, a lightweight atrous spatial pyramid pooling module is used to gather further context information and enlarge the receptive field of the model. The large pooling kernels of the lite reduced atrous spatial pyramid pooling module group extract context information from large inputs quickly and effectively. Skip connections and a feature fusion module then fuse multi-scale feature information. Finally, the semantic output is obtained through upsampling. Result: The proposed network was tested on the Cityscapes dataset. For a 2048×1024 input image on an NVIDIA Tesla V100 GPU, the network achieves 88.4 FPS and 74.5% mIoU. Conclusion: The experimental results show that the proposed network is faster than the other networks in the comparison group when processing 2048×1024 images, which demonstrates the value of this work.
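
The two reported figures, mIoU and FPS, can be made concrete with a minimal sketch. It assumes the standard definition of mean intersection-over-union computed from a per-class confusion matrix; the abstract does not describe the authors' evaluation code. The function names and the 2-class toy matrix are hypothetical and chosen for illustration only; they are not data from the paper.

```python
def mean_iou(confusion):
    """Mean intersection-over-union from a square confusion matrix.

    confusion[i][j] counts pixels whose true class is i and whose
    predicted class is j. Classes that appear in neither the ground
    truth nor the prediction are skipped.
    """
    ious = []
    for c in range(len(confusion)):
        true_positive = confusion[c][c]
        false_negative = sum(confusion[c]) - true_positive
        false_positive = sum(row[c] for row in confusion) - true_positive
        union = true_positive + false_positive + false_negative
        if union > 0:
            ious.append(true_positive / union)
    return sum(ious) / len(ious) if ious else 0.0


def frame_latency_ms(fps):
    """Average time per frame, in milliseconds, at a given frame rate."""
    return 1000.0 / fps


# Hypothetical 2-class confusion matrix (illustrative values only).
toy_confusion = [
    [50, 10],  # true class 0: 50 correct, 10 predicted as class 1
    [5, 35],   # true class 1: 5 predicted as class 0, 35 correct
]
toy_miou = mean_iou(toy_confusion)

# The reported 88.4 FPS corresponds to roughly 11.3 ms per frame.
latency = frame_latency_ms(88.4)
```

Under this definition, each class contributes equally to the mean regardless of how many pixels it covers, so small classes weigh as much as large ones.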