Abstract

Current weakly supervised semantic segmentation methods usually generate noisy pseudo-labels. Training segmentation models with these labels tends to overfit the noise, leading to poor performance. Existing approaches often rely on iterative updates of pseudo-labels at the pixel or image level, ignoring the importance of region-level characteristics. The recently introduced Segment Anything Model (SAM) has advanced several approaches by fusing such region-level masks with noisy pseudo-labels. However, this fusion remains challenging because SAM masks lack semantic information. To address these challenges, we propose Region-based Online Selective Examination (ROSE). Specifically, we first consolidate SAM masks in a bottom-up manner to form a unified region prior. Leveraging this prior, region-level visual information is then aggregated through the proposed region voting strategy. Furthermore, a cross-view selective examination method exploits semantic consistency between different views of an image and performs an examination to correct noisy pseudo-labels. Experimental results show that ROSE achieves a new state of the art on the Pascal VOC and COCO datasets. Moreover, ROSE trains more than 10 times faster than previous methods.
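
To illustrate the general idea of region-level voting, the sketch below aggregates per-pixel pseudo-label scores inside each consolidated SAM region and reassigns the winning class to the whole region. This is a minimal, hypothetical interpretation written for clarity; the function name, the mean-then-argmax vote, and the input layout are assumptions, not the paper's exact algorithm.

```python
import numpy as np

def region_vote(region_ids: np.ndarray, class_scores: np.ndarray) -> np.ndarray:
    """Aggregate noisy per-pixel class scores within each region (illustrative only).

    region_ids:   (H, W) int map, one id per consolidated SAM region (assumed given).
    class_scores: (H, W, C) float array of noisy pseudo-label scores.
    Returns an (H, W) array of region-refined class labels.
    """
    refined = np.zeros(region_ids.shape, dtype=np.int64)
    for rid in np.unique(region_ids):
        mask = region_ids == rid
        # Vote: average the scores over the region, then take the dominant class.
        mean_scores = class_scores[mask].mean(axis=0)
        refined[mask] = mean_scores.argmax()
    return refined
```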
