Abstract

In recent years, image-level weakly supervised semantic segmentation (WSSS) has developed rapidly for natural scenes because image-level classification labels are easy to obtain. In remote sensing (RS) images, however, complex backgrounds, multi-category scenes, and dense small targets make the task harder, and relatively little research has been conducted in this field. To alleviate these problems in RS scenes, this paper proposes a self-supervised Siamese network built on an explicit pixel-level constraint framework, which substantially improves the quality of class activation maps and the localization accuracy in multi-category RS scenes. Specifically, the paper makes three contributions: (a) a pixel-soft classification loss, which imposes explicit constraints on pixels during image-level training; (b) a pixel global awareness module, which captures high-level semantic context and low-level pixel spatial information to improve the consistency and accuracy of RS object segmentation; and (c) a dynamic multi-scale fusion module with a gating mechanism, which enhances feature representation and improves the localization accuracy of RS objects, particularly small and dense ones. Experiments on two RS challenge datasets demonstrate that the proposed modules achieve new state-of-the-art results in the WSSS task using only image-level labels, raising mIoU to 36.79% on iSAID and 45.43% on ISPRS. To the best of our knowledge, this is the first work to perform image-level WSSS on multi-class RS scenes.
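The results above are reported as mIoU (mean intersection-over-union). As a reading aid, here is a minimal sketch of how that metric is commonly computed from a predicted and a reference label map. The function name `mean_iou`, the flat-list input format, and the choice to skip classes absent from both maps are assumptions made for illustration. They are not taken from the paper, whose evaluation protocol (for example, accumulating counts over the whole test set rather than per image) may differ.

```python
from typing import Sequence


def mean_iou(pred: Sequence[int], target: Sequence[int], num_classes: int) -> float:
    """Mean intersection-over-union across classes (illustrative helper).

    `pred` and `target` are flattened label maps holding one class index
    per pixel. Classes that appear in neither map are skipped (assumption).
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same number of pixels")

    intersection = [0] * num_classes   # pixels where both maps agree on class c
    pred_count = [0] * num_classes     # pixels predicted as class c
    target_count = [0] * num_classes   # pixels labelled as class c

    for p, t in zip(pred, target):
        pred_count[p] += 1
        target_count[t] += 1
        if p == t:
            intersection[p] += 1

    ious = []
    for c in range(num_classes):
        union = pred_count[c] + target_count[c] - intersection[c]
        if union > 0:  # skip classes absent from both maps
            ious.append(intersection[c] / union)

    return sum(ious) / len(ious) if ious else 0.0
```

For a four-pixel toy example with `pred = [0, 0, 1, 1]` and `target = [0, 1, 1, 1]`, class 0 has IoU 1/2 and class 1 has IoU 2/3, so the mean is 7/12.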
