Abstract

Deep neural networks have made significant progress in smoke segmentation, but accurately segmenting smoke images remains challenging due to the semi-transparency and large appearance variance of smoke. To improve performance, we propose a Cross-scale Mixed-attention Network (CMNet) built from multi-scale and mixed attention modules. We first concatenate the results of average and maximum pooling along each axis to learn powerful attention coefficients, which weight the original input to produce a directional attention map along that axis. We then combine the three directional attention maps by point-wise addition to form a Fused 3D Attention (F3A) module. In addition, we adopt atrous convolutions to generate multi-scale feature maps, point-wisely add the results of average and maximum pooling on each scale's feature map, and design a bottleneck structure that produces an effective attention map for each scale while reducing the number of learnable parameters. The attention feature maps of all scales are concatenated to obtain Multi-scale Channel Attention (MCA). Finally, we cross-wisely stack the F3A and MCA modules on both shallow and deep feature maps to form Mixed Cross Enhancement (MCE), which fully fuses information across scales. Experiments show that our method surpasses most existing methods.
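The abstract's description of F3A can be illustrated with a minimal NumPy sketch: pool along each axis of a (C, H, W) feature map, derive attention coefficients from the pooled statistics, weight the input, and sum the three directional results. Note this is an assumption-laden toy illustration, not the paper's implementation; in particular, the learned transform that maps the concatenated pooled maps to attention coefficients is replaced here by a fixed averaging step, since its exact form is not given in the abstract.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def directional_attention(x, axis):
    """Toy directional attention for a (C, H, W) feature map.

    Pools along one axis with both average and maximum pooling,
    squashes the pooled statistics into attention coefficients,
    and weights the original input with them (broadcast back
    over the pooled axis).
    """
    avg = x.mean(axis=axis, keepdims=True)
    mx = x.max(axis=axis, keepdims=True)
    # Stand-in for the learned transform over the concatenated
    # pooled maps (hypothetical simplification): average then sigmoid.
    coeff = sigmoid(0.5 * (avg + mx))
    return x * coeff

def fused_3d_attention(x):
    # Point-wise addition of the three directional attention maps,
    # mirroring the F3A fusion described in the abstract.
    return sum(directional_attention(x, ax) for ax in (0, 1, 2))

rng = np.random.default_rng(0)
feat = rng.standard_normal((8, 16, 16))  # toy (C, H, W) feature map
out = fused_3d_attention(feat)
print(out.shape)  # (8, 16, 16) -- same shape as the input
```

In a real network each directional branch would use a learned 1x1 convolution over the concatenated avg/max maps instead of the fixed averaging used here.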
