Abstract

Deep neural networks have made good progress in smoke segmentation, but accurately segmenting smoke images remains challenging because of the semi-transparency and large variance of smoke. To improve performance, we propose a Cross-scale Mixed-attention Network (CMNet) built on multi-scale and mixed attention modules. We first concatenate the results of average and maximum pooling along each axis to learn attention coefficients, which weight the original input to produce a directional attention map along each axis. We then combine the three directional attention maps by point-wise addition to form a Fused 3D Attention (F3A) module. In addition, we adopt atrous convolutions to generate multi-scale feature maps, add the results of average and maximum pooling point-wise on each scale's feature map, and design a bottleneck structure that produces an effective attention map for each scale while reducing the number of learnable parameters. The attention feature maps from all scales are concatenated to obtain Multi-scale Channel Attention (MCA). Finally, we stack the F3A and MCA modules cross-wise on both shallow and deep feature maps to form Mixed Cross Enhancement (MCE), which fully fuses information across scales. Experiments show that our method surpasses most existing methods.
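
The F3A and MCA steps described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not specify layer shapes, so every function name, the single linear map plus sigmoid used for the attention coefficients, the depthwise 3x3 atrous kernels, the bottleneck weights shared across scales, and the dilation rates are assumptions made for illustration. The weights are random placeholders rather than trained parameters, and "pooling along each axis" is read here as pooling over the other two axes to obtain one descriptor per position on that axis.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def directional_attention(x, axis, w):
    """Weight x (C, H, W) along one axis.

    Assumed reading: average- and max-pool over the two other axes,
    concatenate both descriptors, map them to one coefficient per
    position on `axis` with a linear layer w of shape (n, 2n).
    """
    other = tuple(a for a in range(x.ndim) if a != axis)
    desc = np.concatenate([x.mean(axis=other), x.max(axis=other)])  # (2n,)
    coeff = sigmoid(w @ desc)                                       # (n,)
    shape = [1] * x.ndim
    shape[axis] = -1
    return x * coeff.reshape(shape)  # broadcast coefficients over x

def fused_3d_attention(x, weights):
    """F3A sketch: point-wise sum of the three directional attention maps."""
    return sum(directional_attention(x, a, weights[a]) for a in range(3))

def dilated_conv3x3(x, k, d):
    """Depthwise 3x3 atrous convolution (assumed), dilation d, zero padding.

    x: (C, H, W), k: (C, 3, 3). Output has the same shape as x.
    """
    _, h, w = x.shape
    xp = np.pad(x, ((0, 0), (d, d), (d, d)))
    out = np.zeros_like(x)
    for i in range(3):
        for j in range(3):
            out += k[:, i, j, None, None] * xp[:, i * d:i * d + h, j * d:j * d + w]
    return out

def multi_scale_channel_attention(x, kernels, dilations, w_down, w_up):
    """MCA sketch: per-scale channel attention, concatenated over scales.

    w_down: (C // r, C) and w_up: (C, C // r) form the bottleneck; sharing
    them across scales is an assumption of this sketch.
    """
    outs = []
    for k, d in zip(kernels, dilations):
        f = dilated_conv3x3(x, k, d)
        # Point-wise addition of global average and max pooling per channel.
        desc = f.mean(axis=(1, 2)) + f.max(axis=(1, 2))  # (C,)
        hidden = np.maximum(w_down @ desc, 0.0)           # bottleneck + ReLU
        coeff = sigmoid(w_up @ hidden)                    # (C,)
        outs.append(f * coeff[:, None, None])
    return np.concatenate(outs, axis=0)  # (len(dilations) * C, H, W)

# Toy usage with random placeholder weights.
rng = np.random.default_rng(0)
x = rng.standard_normal((4, 6, 5))  # (C, H, W)
f3a_weights = [rng.standard_normal((n, 2 * n)) for n in x.shape]
dilations = (1, 2, 3)               # assumed atrous rates
kernels = [rng.standard_normal((4, 3, 3)) for _ in dilations]
w_down = rng.standard_normal((2, 4))  # reduction ratio r = 2 (assumed)
w_up = rng.standard_normal((4, 2))

f3a_out = fused_3d_attention(x, f3a_weights)
mca_out = multi_scale_channel_attention(x, kernels, dilations, w_down, w_up)
print(f3a_out.shape, mca_out.shape)
```

In this sketch the bottleneck replaces one C-by-C mapping with two smaller ones of sizes C/r-by-C and C-by-C/r, which is one common way to cut learnable parameters. The cross-wise stacking of F3A and MCA on shallow and deep feature maps (MCE) is not shown, because the abstract does not describe its wiring.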
