Weakly supervised semantic segmentation (WSSS) learns semantic segmentation models from weak labels, significantly reducing reliance on pixel-level annotations. WSSS typically employs a multi-label classification network to extract image features, from which localization maps are constructed. The quality of these localization maps critically influences WSSS performance. However, non-target semantic noise within the features limits how far localization map quality can be improved. To address this issue, we propose non-target feature filtering class activation mapping (NFF-CAM) for WSSS, which reduces non-target semantic signals and generates higher-quality localization maps. Specifically, NFF-CAM introduces two modules: class-constrained dual cosine clustering (CDCC) and channel identification (CI). CDCC corrects clustering group relationships among the original features that are unsuitable under a specified class condition. CI identifies the channel features that contain non-target semantic information. We evaluate NFF-CAM extensively on popular datasets, including PASCAL VOC 2012 and MS COCO 2014. Experimental results show that NFF-CAM improves the segmentation performance of off-the-shelf methods.
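For readers unfamiliar with localization maps, the following is a minimal sketch of a standard class activation map (CAM) built from classifier features, written in NumPy. It is not the NFF-CAM implementation: the function name `class_activation_map` and the optional `channel_mask` argument are hypothetical, and the mask only illustrates the general idea of suppressing channels judged to carry non-target semantics. How CDCC and CI actually select those channels is not described in this abstract.

```python
import numpy as np

def class_activation_map(features, weights, class_idx, channel_mask=None):
    """Standard CAM: weighted sum of feature channels for one class.

    features:     (K, H, W) feature maps from the last convolutional layer.
    weights:      (C, K) classifier weights, one row per class.
    channel_mask: optional (K,) array of 0/1 values. Hypothetical stand-in
                  for a channel-filtering step; not the authors' CI module.
    Returns an (H, W) map scaled to [0, 1].
    """
    w = weights[class_idx]
    if channel_mask is not None:
        # Assumption: masked channels contribute nothing to the map.
        w = w * channel_mask
    # Contract the channel axis: sum_k w[k] * features[k, :, :]
    cam = np.tensordot(w, features, axes=([0], [0]))
    cam = np.maximum(cam, 0.0)  # keep only positive evidence
    peak = cam.max()
    return cam / peak if peak > 0 else cam

# Toy example: two channels, each active at a single location.
features = np.zeros((2, 2, 2))
features[0, 0, 0] = 2.0
features[1, 1, 1] = 4.0
weights = np.array([[1.0, 1.0]])

cam_all = class_activation_map(features, weights, class_idx=0)
cam_filtered = class_activation_map(
    features, weights, class_idx=0, channel_mask=np.array([1.0, 0.0])
)
```

In the toy example, masking the second channel removes its response at location (1, 1), so the filtered map highlights only the region supported by the remaining channel.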