Win-Win Cooperation: Semantic Encoding Learning and Saliency Selection for Weakly Supervised Semantic Segmentation

Yuhui Guo, Xun Liang, Bo Wu, Xiangping Zheng, Xuan Zhang

doi:10.1109/tcds.2022.3219860

Abstract

Image-level weakly supervised semantic segmentation methods have attracted increasing attention because of their labeling efficiency. However, most of these methods rely on localization maps generated by a classification network to produce pseudo labels, which leads to sparse object regions, mismatched object boundaries, and co-occurring pixels in the target objects. To address these issues, we propose a novel image-level weakly supervised semantic segmentation algorithm, Semantic Encoding Learning and Saliency Selection (SELSS), which focuses on improving target object identification and boundary quality. Specifically, we design a semantic encoding learning module that measures the Euclidean distance between semantic words and localization maps to identify object coverage, helping the localization map from the classification network cover more semantic regions. To obtain accurate object boundaries and discard co-occurring pixels, we use the encoded localization maps for the foreground and the background to perform saliency selection under pseudo-pixel feedback. Through the cooperation between semantic encoding learning and saliency selection, SELSS better tackles the key challenges of weakly supervised semantic segmentation and significantly improves the quality of the generated pseudo labels. Extensive experiments demonstrate that SELSS achieves state-of-the-art performance on the PASCAL VOC 2012 and MS COCO 2014 segmentation benchmarks.
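As a rough illustration of the distance measure mentioned in the abstract, the sketch below pools per-pixel features using a localization map as weights and then computes the Euclidean distance to a class word embedding. The abstract does not specify how localization maps are turned into vectors, so the weighted pooling step, the function names `cam_weighted_feature` and `euclidean_distance`, and the toy values are all assumptions made for illustration, not the SELSS implementation.

```python
import math

def cam_weighted_feature(features, cam):
    """Average per-pixel feature vectors, weighted by localization-map scores.

    Assumption: this pooling step is hypothetical; the abstract does not
    describe how a localization map is compared with a semantic word.
    """
    total = sum(cam)
    if total == 0:
        raise ValueError("localization map has no activation")
    dim = len(features[0])
    return [
        sum(w * f[d] for w, f in zip(cam, features)) / total
        for d in range(dim)
    ]

def euclidean_distance(a, b):
    """Standard Euclidean (L2) distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Toy data (hypothetical): 4 pixels with 2-dimensional features.
features = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
cam = [1.0, 1.0, 0.0, 0.0]        # localization map activates on the first two pixels
word_embedding = [1.0, 0.0]       # stand-in for a class word embedding

pooled = cam_weighted_feature(features, cam)
distance = euclidean_distance(pooled, word_embedding)
```

In this toy setup, a small distance indicates that the activated region's features agree with the class word; a training objective could in principle penalize large distances to encourage broader object coverage, though the exact loss used by SELSS is not given in the abstract.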

Full Text