Abstract

Visual smoke semantic segmentation is a challenging task because smoke is semi‐transparent, varies in shape, and has complex textures. To improve segmentation performance, a hybrid convolutional neural network and transformer network based on pyramid Gaussian pooling (PGP) is proposed for smoke segmentation. A PGP method is designed to suppress noise through low‐pass filtering. The output of PGP is then reshaped into a set of visual tokens for transformers, yielding a PGP‐transformer module that makes full use of the self‐attention mechanism. Finally, the PGP‐transformer module is inserted into a U‐shaped architecture with skip connections. Extensive experiments show that the method significantly outperforms existing state‐of‐the‐art algorithms on virtual and real smoke datasets, and ablation experiments verify the effectiveness of the proposed modules.
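
The abstract does not give the exact definition of PGP, so the following is only a minimal sketch of one plausible reading: each channel of a feature map is low‐pass filtered with a Gaussian kernel, sampled on several grid sizes (a pyramid), and the pooled cells are flattened into tokens. The function names (`gaussian_pool`, `pgp_tokens`), the grid sizes (1, 2, 4), the border clamping, and the rule tying the kernel radius and sigma to the cell size are all assumptions made for illustration, not the authors' implementation.

```python
import math


def gaussian_kernel(radius, sigma):
    """Return a normalized 1-D Gaussian kernel of length 2 * radius + 1."""
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def blur_rows(channel, kernel):
    """Convolve every row with the kernel, clamping indices at the borders."""
    radius = len(kernel) // 2
    width = len(channel[0])
    return [
        [sum(k * row[min(max(x + d, 0), width - 1)]
             for d, k in zip(range(-radius, radius + 1), kernel))
         for x in range(width)]
        for row in channel
    ]


def transpose(channel):
    """Swap rows and columns of a 2-D list."""
    return [list(col) for col in zip(*channel)]


def gaussian_blur(channel, radius, sigma):
    """Separable Gaussian low-pass filter: blur rows, then columns."""
    kernel = gaussian_kernel(radius, sigma)
    return transpose(blur_rows(transpose(blur_rows(channel, kernel)), kernel))


def gaussian_pool(channel, grid):
    """Blur one channel, then sample the center of each cell of a grid x grid layout.

    Assumption: the kernel radius is half the cell size and sigma is half the radius.
    """
    height, width = len(channel), len(channel[0])
    step_y, step_x = height // grid, width // grid
    radius = max(1, max(step_y, step_x) // 2)
    blurred = gaussian_blur(channel, radius, sigma=radius / 2.0)
    return [[blurred[gy * step_y + step_y // 2][gx * step_x + step_x // 2]
             for gx in range(grid)]
            for gy in range(grid)]


def pgp_tokens(feature_map, grids=(1, 2, 4)):
    """Turn a [C][H][W] feature map into a list of C-dimensional tokens.

    Each pyramid level with grid size g contributes g * g tokens.
    """
    tokens = []
    for grid in grids:
        pooled = [gaussian_pool(channel, grid) for channel in feature_map]
        for gy in range(grid):
            for gx in range(grid):
                tokens.append([p[gy][gx] for p in pooled])
    return tokens


# Hypothetical 2-channel, 8 x 8 feature map with constant channels.
feature_map = [[[1.0] * 8 for _ in range(8)], [[3.0] * 8 for _ in range(8)]]
tokens = pgp_tokens(feature_map)
print(len(tokens), len(tokens[0]))
```

With grids of 1, 2, and 4, this sketch yields 1 + 4 + 16 = 21 tokens, each with one entry per channel. In a hybrid network of the kind the abstract describes, such a token sequence would then be passed to a transformer encoder; that step is not shown here.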
