Abstract

Acquiring sufficient ground-truth supervision to train deep visual models has been a bottleneck over the years due to the data-hungry nature of deep learning. This is exacerbated in some structured prediction tasks, such as semantic segmentation, which requires pixel-level annotations. This work addresses weakly supervised semantic segmentation (WSSS), with the goal of bridging the gap between image-level annotations and pixel-level segmentation. We formulate WSSS as a novel group-wise learning task that explicitly models semantic dependencies in a group of images to estimate more reliable pseudo ground-truths, which can be used for training more accurate segmentation models. In particular, we devise a graph neural network (GNN) for group-wise semantic mining, wherein input images are represented as graph nodes, and the underlying relations between a pair of images are characterized by an efficient co-attention mechanism. Moreover, in order to prevent the model from paying excessive attention to common semantics only, we further propose a graph dropout layer, encouraging the model to learn more accurate and complete object responses. The whole network is end-to-end trainable by iterative message passing, which propagates interaction cues over the images to progressively improve the performance. We conduct experiments on the popular PASCAL VOC 2012 and COCO benchmarks, and our model yields state-of-the-art performance. Our code is available at: https://github.com/Lixy1997/Group-WSSS.
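
To make the mechanism described above concrete, the following is a minimal PyTorch sketch of group-wise message passing with pairwise co-attention and a graph dropout layer. It is based only on this abstract, not on the released code: the names (CoAttention, GraphDropout, message_passing), the bilinear affinity, the residual mean aggregation, and all tensor shapes are illustrative assumptions, not the authors' actual design.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class CoAttention(nn.Module):
    """Pairwise co-attention: node i reads out semantics from node j."""

    def __init__(self, dim):
        super().__init__()
        # Bilinear affinity between the two images' pixel features.
        self.proj = nn.Linear(dim, dim, bias=False)

    def forward(self, fi, fj):
        # fi: (HW_i, dim), fj: (HW_j, dim) flattened conv features.
        affinity = self.proj(fi) @ fj.t()        # (HW_i, HW_j)
        attn = F.softmax(affinity, dim=-1)       # attend over image j
        return attn @ fj                         # message for node i


class GraphDropout(nn.Module):
    """Randomly drops whole incoming edges during training, so a node
    cannot rely only on the semantics shared by most group members."""

    def __init__(self, p=0.3):
        super().__init__()
        self.p = p

    def forward(self, messages):
        # messages: (num_neighbors, HW, dim)
        if not self.training or self.p == 0.0:
            return messages
        keep = (torch.rand(messages.size(0), 1, 1,
                           device=messages.device) > self.p).float()
        return messages * keep / (1.0 - self.p)  # inverted-dropout scaling


def message_passing(feats, co_att, g_drop, steps=2):
    """Iteratively propagates co-attention messages over the group graph.

    feats: list of (HW, dim) node feature tensors, one per image.
    """
    for _ in range(steps):
        updated = []
        for i, fi in enumerate(feats):
            msgs = torch.stack([co_att(fi, fj)
                                for j, fj in enumerate(feats) if j != i])
            msgs = g_drop(msgs)
            updated.append(fi + msgs.mean(dim=0))  # residual aggregation
        feats = updated
    return feats


# Toy usage: a group of four images, 14x14 feature maps, 256 channels.
if __name__ == "__main__":
    group = [torch.randn(196, 256) for _ in range(4)]
    refined = message_passing(group, CoAttention(256), GraphDropout(0.3))
    print(refined[0].shape)  # torch.Size([196, 256])
```

One design note on this sketch: because relations are modeled pairwise, each message is an ordinary attention read-out over a single neighbor image, so the cost scales with the number of image pairs in the group rather than requiring joint attention over all pixels of all images at once. The repository linked in the abstract contains the authors' actual formulation.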
