Abstract
The encoder-decoder structure is widely used in deep learning for real-time dense segmentation tasks. Given the limited computational budget of mobile devices, we present a lightweight asymmetric encoder-decoder network, named LAENet, which performs real-time semantic segmentation quickly and efficiently. In the encoder, we employ asymmetric convolution and group convolution combined with dilated convolution and dense connectivity, which reduces computation cost and model size while guaranteeing an adequate receptive field and enhancing the model's learning ability. In the decoder, a feature pyramid network (FPN) structure combined with an attention mechanism and an ECRE block is used to strike a balance between network complexity and segmentation performance. LAENet has only 0.84M parameters and reaches 66 FPS on a single GTX 1080Ti GPU. Experiments on the Cityscapes dataset demonstrate that LAENet achieves a better trade-off between speed and accuracy than existing segmentation networks, without any post-processing.
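As a rough illustration of why asymmetric, group, and dilated convolutions reduce model size while preserving receptive field, the following sketch counts the weights of a single layer under each choice. It is not the LAENet implementation: the channel width (64), group count (4), and dilation rate (4) are hypothetical values chosen only for the arithmetic, and the helper names `conv_params` and `effective_kernel` are introduced here for illustration.

```python
def conv_params(c_in, c_out, k_h, k_w, groups=1):
    """Number of weights in a 2D convolution layer (bias omitted)."""
    assert c_in % groups == 0 and c_out % groups == 0
    # Each output channel only sees c_in / groups input channels.
    return (c_in // groups) * c_out * k_h * k_w


def effective_kernel(k, dilation):
    """Spatial extent covered by a k-tap kernel at a given dilation rate."""
    return dilation * (k - 1) + 1


# Hypothetical settings, not taken from the paper.
channels, groups, dilation = 64, 4, 4

# Standard 3x3 convolution.
standard = conv_params(channels, channels, 3, 3)

# Asymmetric factorization: a 3x1 convolution followed by a 1x3 convolution.
asymmetric = (conv_params(channels, channels, 3, 1)
              + conv_params(channels, channels, 1, 3))

# The same factorization with grouped channels.
asymmetric_grouped = (conv_params(channels, channels, 3, 1, groups)
                      + conv_params(channels, channels, 1, 3, groups))

# Dilation widens the area a kernel covers without adding weights.
extent = effective_kernel(3, dilation)

print(standard, asymmetric, asymmetric_grouped, extent)
```

Under these assumed settings, factorizing the kernel cuts the weight count by a third, grouping divides it further by the number of groups, and the dilated 3-tap kernel spans 9 positions at no extra parameter cost.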