High-resolution rectified gradient-based visual explanations for weakly supervised segmentation

Tianyou Zheng, Qiang Wang, Yue Shen, Xiang Ma, Xiaotian Lin

doi:10.1016/j.patcog.2022.108724

Abstract

Visual explanations for convolutional neural networks (CNNs) act as the backbone of weakly supervised segmentation with image-level labels. This paper proposes a high-resolution rectified gradient-based class activation mapping (HRCAM), together with a variant that uses bounding box (bbox) annotations (HRCAM-BB), to improve the initial seeds for weakly supervised segmentation (WSS) tasks. HRCAM extends Grad-CAM by separating the gradient maps from the class activation maps from the shallow layer for higher resolution. Gradient rectification methods are proposed to improve the visualization and the WSS score. Experiments on the Pascal VOC 2012 and COCO datasets verify the performance of HRCAM-BB. On the Pascal VOC 2012 set, our method achieves a mean intersection over union (mIOU) of 71.6 with image-level labels and 78.2 with bbox on weakly supervised semantic segmentation (WSSS), and increases the weakly supervised instance segmentation (WSIS) mIOU (AP50) to 52.1 with image-level labels and 61.9 with bbox. Our method surpasses the previous state-of-the-art (SOTA) approach under the same conditions.
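For orientation, the standard Grad-CAM computation that HRCAM builds on can be sketched as follows. This is a minimal illustration of plain Grad-CAM only, not the HRCAM or HRCAM-BB method: it does not include the separation of gradient and activation maps, the shallow-layer choice, or the gradient rectification described above. The function name `grad_cam`, the nested-list layout [K][H][W], and the toy values are assumptions made for the example.

```python
def grad_cam(activations, gradients):
    """Standard Grad-CAM on nested lists shaped [K][H][W] (illustrative only)."""
    num_channels = len(activations)
    height = len(activations[0])
    width = len(activations[0][0])

    # Channel weight: global average of the class-score gradient for that channel.
    weights = [
        sum(sum(row) for row in grad_map) / (height * width)
        for grad_map in gradients
    ]

    # Weighted sum of the activation maps over channels.
    cam = [[0.0] * width for _ in range(height)]
    for k in range(num_channels):
        for i in range(height):
            for j in range(width):
                cam[i][j] += weights[k] * activations[k][i][j]

    # ReLU keeps only locations with positive evidence for the class.
    return [[max(0.0, v) for v in row] for row in cam]


# Toy input: 2 channels on a 2x2 grid (made-up values, not data from the paper).
activations = [
    [[1.0, 2.0], [0.0, 1.0]],
    [[0.0, 1.0], [3.0, 1.0]],
]
gradients = [
    [[1.0, 1.0], [1.0, 1.0]],      # channel weight = 1.0
    [[-1.0, -1.0], [-1.0, -1.0]],  # channel weight = -1.0
]
cam = grad_cam(activations, gradients)
```

In this toy case the second channel has a negative weight, so locations where it dominates are suppressed by the final ReLU. The spatial size of the resulting map equals that of the chosen layer's feature maps, which is why the choice of layer determines the resolution of the explanation.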

Full Text