LightSeg: Local Spatial Perception Convolution for Real-Time Semantic Segmentation

Xiaochun Lei, Jiaming Liang, Zetao Jiang, Zhaoting Gong

doi:10.3390/app13148130

Open Access

https://doi.org/10.3390/app13148130

Journal: Applied Sciences	Publication Date: Jul 12, 2023
Citations: 1	License type: CC BY 4.0

Affiliation: Guilin University of Electronic Technology

Abstract

Semantic segmentation is increasingly applied on mobile devices thanks to advances in mobile chipsets, particularly in low-power scenarios. However, the lightweight network designs that mobile devices require limit the receptive field, which is crucial for dense prediction problems. Existing approaches attempt to balance lightweight design and high accuracy by downsampling features in the backbone, but this downsampling can lose local details at each network stage. To address this challenge, this paper presents LightSeg, a compact and efficient convolutional neural network (CNN) for real-time applications, built around the proposed local spatial perception convolution (LSPConv). The effectiveness of the architecture is demonstrated on the Cityscapes dataset, where the model balances accuracy against inference speed. Specifically, LightSeg, which does not rely on ImageNet pretraining, achieves 76.1% mIoU at 61 FPS on the Cityscapes validation set using an RTX 2080Ti GPU with mixed precision. It also reaches 115.7 FPS on the Jetson NX with int8 precision.
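
For readers unfamiliar with the mIoU metric quoted in the abstract, the following is a minimal sketch of how mean intersection-over-union is commonly computed from a confusion matrix. It is not the authors' evaluation code: the function name `mean_iou`, its parameters, and the ignore label of 255 (a common convention for Cityscapes training labels) are assumptions made for illustration.

```python
import numpy as np


def mean_iou(pred, target, num_classes, ignore_index=255):
    """Mean IoU over the classes that occur in the prediction or ground truth.

    Hypothetical helper for illustration; ignore_index=255 is an assumed
    convention, not something stated in the abstract.
    """
    pred = np.asarray(pred).ravel().astype(np.int64)
    target = np.asarray(target).ravel().astype(np.int64)

    # Drop pixels whose ground-truth label is marked as "ignore".
    valid = target != ignore_index
    pred, target = pred[valid], target[valid]

    # Confusion matrix: rows are ground-truth classes, columns are predictions.
    conf = np.bincount(
        target * num_classes + pred, minlength=num_classes ** 2
    ).reshape(num_classes, num_classes)

    intersection = np.diag(conf)
    union = conf.sum(axis=0) + conf.sum(axis=1) - intersection

    # Average only over classes that appear at least once.
    present = union > 0
    return float((intersection[present] / union[present]).mean())


if __name__ == "__main__":
    score = mean_iou([0, 0, 1, 1], [0, 1, 1, 255], num_classes=2)
    print(f"mIoU = {score:.3f}")
```

For a dataset-level score, benchmark evaluations typically accumulate one confusion matrix over all validation images before taking the per-class ratio, rather than averaging per-image scores as this single-call sketch would.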

Full Text