Abstract

Achieving a suitable balance between effectiveness and efficiency in lightweight semantic segmentation networks has remained difficult in recent years. This work presents EBUNet, an efficient and reliable semantic segmentation method that aims for a favorable trade-off between inference speed and prediction accuracy. First, we develop an Efficient Bottleneck Unit (EBU) that employs depth-wise convolution and depth-wise dilated convolution to obtain adequate features at moderate computational cost. Second, we develop a novel Image Partition Attention Module (IPAM), which divides feature maps into subregions and generates attention weights based on them. Third, we develop a novel lightweight attention decoder that recovers spatial information effectively. Extensive experiments show that EBUNet achieves 73.4% mIoU at 152 FPS on the Cityscapes dataset and 72.2% mIoU at 147 FPS on the CamVid dataset with only 1.57 M parameters. These results confirm that the proposed model strikes a good trade-off among accuracy, inference speed, and model size. The source code of EBUNet is available at https://github.com/Skybird1101/EBUNet.
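
To give a rough sense of why depth-wise and depth-wise dilated convolutions keep computation moderate, the sketch below compares weight counts for a standard and a depth-wise convolution and computes the spatial extent of a dilated kernel. It uses only the standard textbook formulas; the channel width (64), kernel size (3), and dilation rate (4) are illustrative assumptions and are not taken from the EBUNet configuration.

```python
def standard_conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_conv_params(channels, k):
    """Weights in a depth-wise k x k convolution: one k x k filter per channel."""
    return k * k * channels


def effective_kernel_size(k, dilation):
    """Spatial extent covered by a k x k kernel with the given dilation rate."""
    return k + (k - 1) * (dilation - 1)


# Hypothetical layer sizes, chosen only for illustration.
channels, k, dilation = 64, 3, 4

print(standard_conv_params(channels, channels, k))  # 36864
print(depthwise_conv_params(channels, k))           # 576
print(effective_kernel_size(k, dilation))           # 9
```

Under these assumed sizes, the depth-wise layer uses 64 times fewer weights than the standard layer, and dilation widens the covered area from 3 x 3 to 9 x 9 without adding any weights.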
