Abstract

This paper proposes a dual-branch semantic segmentation model based on depthwise separable convolution and an attention mechanism to meet the real-time and accuracy requirements of semantic segmentation. The approach addresses the poor segmentation quality and overly simple feature fusion that arise from repeated downsampling in semantic segmentation. The network is divided into a spatial detail path and a semantic information path. The spatial detail path uses a smaller downsampling factor to preserve resolution and extract spatial information efficiently. The semantic information path is built from non-bottleneck residual units with dilated convolution and extracts semantic features. To aggregate features, a feature-guided fusion module assigns different weights to the outputs of the two paths and fuses them to obtain the final output. On the Cityscapes dataset, the proposed algorithm achieves a segmentation accuracy of 69.6% at 70 fps with only 0.76 M model parameters, indicating some advantages over recent real-time semantic segmentation algorithms. Depthwise separable convolution and the attention mechanism allow the method to extract features effectively and compensate for the accuracy lost to downsampling. The experiments demonstrate that the proposed fusion module outperforms other methods in fusing different features.
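The small parameter count reported above is consistent with how depthwise separable convolution factorizes a standard convolution. The following is a minimal sketch of that arithmetic only, not the paper's architecture: the channel widths and kernel size are hypothetical values chosen for illustration, and the helper names `conv_params` and `depthwise_separable_params` are assumptions introduced here.

```python
def conv_params(c_in, c_out, k, bias=False):
    """Number of weights in a standard k x k convolution."""
    return c_in * c_out * k * k + (c_out if bias else 0)


def depthwise_separable_params(c_in, c_out, k, bias=False):
    """Number of weights in a depthwise k x k convolution followed by a 1 x 1 pointwise convolution."""
    depthwise = c_in * k * k + (c_in if bias else 0)   # one k x k filter per input channel
    pointwise = c_in * c_out + (c_out if bias else 0)  # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


# Hypothetical layer sizes, not taken from the paper.
c_in, c_out, k = 64, 128, 3

standard = conv_params(c_in, c_out, k)
separable = depthwise_separable_params(c_in, c_out, k)
ratio = separable / standard  # equals 1/c_out + 1/k**2 when bias is disabled

print(f"standard:  {standard} weights")
print(f"separable: {separable} weights")
print(f"ratio:     {ratio:.4f}")
```

Without bias terms the ratio reduces to 1/c_out + 1/k², so for 3 × 3 kernels the separable form needs roughly one ninth of the weights once the output width is large.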
