Learning class-agnostic masks with cross-task refinement for weakly supervised semantic segmentation

Lian Xu, Mohammed Bennamoun, Farid Boussaid, Wanli Ouyang, Dan Xu

doi:10.1007/s00521-023-08826-0

Abstract

Weakly supervised semantic segmentation (WSSS) commonly relies on Class Activation Mapping (CAM) to produce pseudo semantic labels from image-level annotations. However, because class activation maps often cover only sparse object regions with poor boundaries, they cannot provide sufficient segmentation supervision. Off-the-shelf saliency maps, by contrast, provide rich object boundaries that can be leveraged to improve semantic segmentation. We therefore propose to jointly learn semantic segmentation and class-agnostic masks, using image-level annotations and off-the-shelf saliency maps as supervision. We also propose a cross-task label refinement mechanism, which uses the learned class-agnostic masks and semantic segmentation masks to refine the pseudo labels and provide more accurate supervision to both tasks. Moreover, we introduce a new normalization method for CAM that generates more complete class-specific localization maps. The improved class activation maps complement our learned class-agnostic masks, leading to high-quality pseudo semantic segmentation labels. Extensive experiments demonstrate the effectiveness of the proposed approach, which establishes state-of-the-art WSSS results on PASCAL VOC 2012 and MS COCO.

Full Text