LBARNet: Lightweight bilateral asymmetric residual network for real-time semantic segmentation

Xuegang Hu, Baoman Zhou

doi:10.1016/j.cag.2023.07.039

Abstract

Real-time semantic segmentation, a key technique for scene understanding, has been an important research topic in computer vision in recent years. However, existing models cannot achieve good segmentation accuracy on mobile devices because of their large computational overhead, which makes it difficult to meet practical industrial requirements. To address this problem, this paper proposes a lightweight bilateral asymmetric residual network (LBARNet) for real-time semantic segmentation. First, we propose the bilateral asymmetric residual (BAR) module. This module learns multi-scale feature representations with strong semantic information at different stages of the semantic information extraction branch, thus improving pixel classification performance. Second, the spatial information extraction (SIE) module is constructed in the spatial detail extraction branch to capture multi-level local features of the shallow network and compensate for the geometric information lost in the downsampling stage. In addition, we design the attention mechanism perception (AMP) module in the skip connection to enhance the contextual representation. Finally, we design the dual branch feature fusion (DBF) module, which exploits the correspondence between high-level and low-level features to fuse spatial and semantic information appropriately. The experimental results show that LBARNet, without any pre-training or pre-processing and using only 0.6M parameters, achieves 70.8% mIoU and 67.2% mIoU on the Cityscapes and CamVid datasets, respectively. LBARNet maintains high segmentation accuracy while using fewer parameters than most existing state-of-the-art models.
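The accuracy figures above are reported as mean intersection over union (mIoU). As a minimal sketch of how that metric is commonly defined, the following averages the per-class IoU over flat lists of predicted and ground-truth labels. The function name `mean_iou` and the toy labels are illustrative assumptions, not taken from the paper; benchmark evaluations usually accumulate a confusion matrix over the whole dataset rather than over a single sample.

```python
def mean_iou(pred, target, num_classes):
    """Average per-class IoU over classes present in pred or target.

    pred and target are flat sequences of integer class labels
    (hypothetical helper for illustration only).
    """
    ious = []
    for c in range(num_classes):
        # Pixels labelled c in both prediction and ground truth.
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        # Pixels labelled c in either prediction or ground truth.
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:  # skip classes absent from both
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example with three classes and six pixels.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
score = mean_iou(pred, target, num_classes=3)
```

Here the per-class IoU values are 1/3, 2/3, and 1/2, so the mean is 0.5.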

Full Text