Abstract

Smoke semantic segmentation (SSS) is more difficult and challenging than segmentation of most other objects because of the special characteristics of smoke, such as its non-rigidity, translucency, and variable modes. To localize smoke accurately in real, complex scenes and to advance intelligent fire detection, we propose a Smoke-Aware Global-Interactive Non-local Network (SAGINN) for SSS, which harnesses the power of both convolution and transformers to capture local and global information simultaneously. The non-local mechanism is a powerful means of modeling long-range context dependencies; however, because it operates on single-scale, low-resolution features, its potential to produce high-quality representations is limited. We therefore propose a Global-Interactive Non-local (GINL) module that leverages global interaction among multi-scale key information to improve the robustness of feature representations. To suppress interference from smoke-like objects, a Pyramid High-level Semantic Aggregation (PHSA) module is designed, in which high-level category semantics learned from classification provide additional image-level guidance that corrects erroneous information in the segmentation representations and alleviates the inter-class similarity problem. In addition, we propose a novel loss function, termed the Smoke-Aware Loss (SAL), which assigns different weights to different objects according to their importance. We evaluate SAGINN on extensive synthetic and real data to verify its generalization ability. Experimental results show that SAGINN achieves an average mIoU of about 83% on the three test sets of SYN70K (83.33%, 82.72%, and 82.94%, an accuracy improvement of about 0.5%), as well as 0.002 mMse and 0.805 Fβ on SMOKE5K; it obtains more accurate smoke localization and finer boundaries, and achieves satisfactory results on smoke-like objects.
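The Smoke-Aware Loss described above weights objects by importance. A minimal sketch of one plausible realization is a pixel-wise cross-entropy scaled by per-class importance weights; the class layout and weight values below are illustrative assumptions, not the paper's actual formulation.

```python
import numpy as np

def smoke_aware_loss(probs, labels, class_weights):
    """Weighted pixel-wise cross-entropy: each pixel's loss term is
    scaled by the importance weight of its ground-truth class.

    probs:         (H, W, C) softmax probabilities
    labels:        (H, W) integer class ids
    class_weights: (C,) importance weight per class
    """
    eps = 1e-12
    # Probability the model assigned to the true class at each pixel.
    p_true = np.take_along_axis(probs, labels[..., None], axis=-1)[..., 0]
    w_map = class_weights[labels]  # per-pixel importance weight
    return float(np.mean(-w_map * np.log(p_true + eps)))

# Illustrative classes: 0 = background, 1 = smoke, 2 = smoke-like object,
# with smoke weighted highest (assumed weights, not from the paper).
probs = np.full((2, 2, 3), 1.0 / 3.0)   # uniform predictions
labels = np.array([[0, 1], [1, 2]])
weights = np.array([0.5, 2.0, 1.0])
loss = smoke_aware_loss(probs, labels, weights)
```

With uniform predictions, every pixel contributes -log(1/3), so the loss reduces to the mean class weight times that value; up-weighting the smoke class makes errors on smoke pixels dominate the gradient.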
