Abstract

Real-time semantic segmentation is an important task in computer vision, with wide applications in autonomous driving and medical imaging. Existing lightweight networks usually improve inference speed at the expense of segmentation accuracy, and striking a balance between the two remains a challenging problem. In this paper, we propose an attention-based lightweight asymmetric network (ALANet) to address this problem. Specifically, in the encoder, a channel-wise attention based depth-wise asymmetric block (CADAB) is designed to extract sufficient features with a small number of parameters. In the decoder, a spatial attention based pyramid pooling (SAPP) module aggregates multi-scale context information using only a few convolution and pooling operations, and a pixel-wise attention based multi-scale feature fusion (PAMFF) module fuses features from different scales and generates pixel-wise attention to improve the restoration of spatial detail. ALANet has only 1.32M parameters. Experimental results on the Cityscapes and CamVid datasets show that ALANet achieves mIoU of 74.4% and 69.5% at inference speeds of 115.6 FPS and 113.2 FPS, respectively, demonstrating a good balance between accuracy and speed.
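To make the block design more concrete, the following is a minimal PyTorch sketch of what a CADAB-style block could look like: depth-wise asymmetric (3×1 then 1×3) convolutions to cut parameters, a dilated branch for context, and SE-style channel attention. The class names, layer ordering, reduction ratio, and dilation rate are illustrative assumptions, not the paper's actual implementation.

```python
import torch
import torch.nn as nn


class ChannelAttention(nn.Module):
    """SE-style channel attention: global pooling -> bottleneck -> sigmoid gate."""

    def __init__(self, channels: int, reduction: int = 4):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Sequential(
            nn.Conv2d(channels, channels // reduction, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, kernel_size=1),
            nn.Sigmoid(),
        )

    def forward(self, x):
        return x * self.fc(self.pool(x))


class CADAB(nn.Module):
    """Illustrative channel-wise attention based depth-wise asymmetric block.

    Factorizing a 3x3 depth-wise conv into 3x1 and 1x3 depth-wise convs
    reduces parameters; a dilated branch enlarges the receptive field.
    """

    def __init__(self, channels: int, dilation: int = 2):
        super().__init__()
        c = channels
        # Local branch: plain depth-wise asymmetric convolutions.
        self.local = nn.Sequential(
            nn.Conv2d(c, c, (3, 1), padding=(1, 0), groups=c),
            nn.Conv2d(c, c, (1, 3), padding=(0, 1), groups=c),
            nn.BatchNorm2d(c),
            nn.ReLU(inplace=True),
        )
        # Context branch: dilated depth-wise asymmetric convolutions.
        self.context = nn.Sequential(
            nn.Conv2d(c, c, (3, 1), padding=(dilation, 0),
                      dilation=(dilation, 1), groups=c),
            nn.Conv2d(c, c, (1, 3), padding=(0, dilation),
                      dilation=(1, dilation), groups=c),
            nn.BatchNorm2d(c),
            nn.ReLU(inplace=True),
        )
        self.attn = ChannelAttention(c)

    def forward(self, x):
        out = self.local(x) + self.context(x)
        out = self.attn(out)          # gate channels by learned importance
        return out + x                # residual connection preserves gradients


if __name__ == "__main__":
    x = torch.randn(1, 64, 64, 128)
    print(CADAB(64)(x).shape)  # torch.Size([1, 64, 64, 128])
```

Depth-wise factorization is what keeps the parameter count low: a depth-wise 3×1 plus 1×3 pair uses 6 weights per channel instead of the 9 of a full 3×3 kernel, and avoids the C×C cross-channel cost of a standard convolution entirely.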
