Abstract

Semantic segmentation is an important and challenging problem in computer vision. Applications such as automated driving and robotic navigation in urban road scenes require segmentation that is both accurate and efficient. Existing models, however, tend either to run fast at the cost of a large parameter count, or to save memory at the cost of very low speed, making them ill-suited to real-time semantic segmentation. To address this trade-off, we propose the Lightweight Asymmetric Dilation Network (LADNet), a model that combines high speed with a small number of parameters and high accuracy. LADNet is built on our Lightweight Asymmetric Dilation Module (LAD Module), which provides a larger receptive field than existing lightweight models and thus captures more contextual information; its two variants, Lightweight Asymmetric Dilation-A (LAD-A) and Lightweight Asymmetric Dilation-B (LAD-B), are tailored to perceiving spatial and semantic information, and semantic information alone, respectively. Our Lightweight Downsampling Module (LDM) downsamples the feature map and thereby greatly reduces the number of model parameters. Finally, our Attention Enhancement Decoder (AED) restores the feature map to the resolution of the original image; it uses two attentional feature maps to jointly guide semantic information for better segmentation. Extensive experiments on the Cityscapes, CamVid, and NYUv2 test sets show that our model achieves the best balance among parameters, accuracy, and speed.
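The abstract does not detail the LAD Module's internals, but its name suggests factorized (asymmetric) dilated convolutions. The sketch below illustrates, with purely hypothetical layer choices, why that combination yields a large receptive field at a low parameter cost; none of the dilation rates or channel counts are taken from the paper.

```python
# Hypothetical sketch: how stacked dilated convolutions enlarge the
# receptive field, and how asymmetric (3x1 + 1x3) factorization cuts
# parameters. All concrete numbers here are illustrative assumptions.

def receptive_field(layers):
    """Receptive field of a stack of stride-1 conv layers.

    `layers` is a list of (kernel_size, dilation) pairs; each layer
    grows the field by (kernel_size - 1) * dilation.
    """
    rf = 1
    for k, d in layers:
        rf += (k - 1) * d
    return rf

def conv_params(c_in, c_out, kh, kw):
    """Weight count of a 2-D convolution (bias ignored)."""
    return c_in * c_out * kh * kw

# A stack of 3x3 convolutions with exponentially growing dilation rates
# (a common pattern in dilated networks) covers a 31-pixel field:
dilations = [1, 2, 4, 8]
print(receptive_field([(3, d) for d in dilations]))  # 31

# Factorizing each 3x3 conv into a 3x1 followed by a 1x3 (the
# "asymmetric" idea) removes a third of the weights per layer:
c = 64
full = conv_params(c, c, 3, 3)                             # 9 * c * c
asym = conv_params(c, c, 3, 1) + conv_params(c, c, 1, 3)   # 6 * c * c
print(full, asym)  # 36864 24576
```

The receptive-field formula shows why dilation, not depth, is the cheap route to context: four factorized layers see 31 pixels with fewer weights than three plain 3x3 layers would need.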
