Abstract

Weakly supervised semantic segmentation (WSSS) has attracted considerable attention because it can generate pixel-level labels from weak labels. WSSS methods use class activation maps (CAMs) to generate pseudo-masks from image-level labels. However, CAMs focus primarily on the most discriminative features of an object, so less discriminative features may be ignored or left unidentified. Co-occurring pixels can also make it impossible to distinguish foreground from background. In this paper, we propose Graph RecalibratiOn with Scaling Weight uNit (GROWN) to address these challenges. GROWN uses a graph structure to model the relationship between local and global features. Scaling weights that aggregate contextual features allow the image’s semantic features to be represented adaptively. The proposed method captures long-range dependencies and extracts contextual features to improve pseudo-mask quality, and as a result it predicts pixel-level labels effectively. Experiments on the PASCAL VOC 2012 and MS COCO datasets show that GROWN outperforms state-of-the-art WSSS methods that use image-level labels.
