Abstract

Although convolutional neural networks (CNNs) have achieved remarkable progress in weakly supervised semantic segmentation (WSSS), they still suffer from object incompleteness, owing to their limited receptive field and insufficient use of global context information. Based on these observations, we propose a simple and effective method, WegFormer. Specifically, WegFormer uses a Vision Transformer (ViT) as the classification network to capture global context information, and is equipped with the Deep Taylor Decomposition (DTD) principle and a Soft Erase (SE) module to generate more complete pseudo-labels and to smooth them further. However, we observe that although the generated pseudo-labels are more complete, they intrude into the background region; we refer to this as the background incompleteness problem. The Efficient Potential Object Mining (EPOM) module we propose addresses this problem. Extensive experiments on the challenging PASCAL VOC 2012 and MS COCO 2014 datasets demonstrate the effectiveness of WegFormer, which obtains superior results on the PASCAL VOC 2012 and MS COCO 2014 validation sets.

