Abstract

Segmentation of pathological images is a crucial step for accurate cancer diagnosis. However, acquiring dense annotations of such images for training is labor-intensive and time-consuming. To address this issue, Semi-Supervised Learning (SSL) has the potential to reduce the annotation cost, but effectively exploiting a large number of unlabeled training images remains challenging. In this paper, we propose a novel SSL method based on Cross Distillation of Multiple Attentions and Seg-CAM Consistency (CDMA+) to effectively leverage unlabeled images. First, we propose a Multi-attention Tri-decoder Network (MTNet) that consists of a shared encoder and three decoders, with each decoder using a different attention mechanism that calibrates features in different aspects to generate diverse outputs. Second, we introduce Cross Decoder Knowledge Distillation (CDKD) between the three decoders, allowing them to learn from each other's soft labels to mitigate the negative impact of incorrect pseudo labels during training. Third, motivated by the observation that the Class Activation Maps (CAMs) derived from the classification task can provide a rough segmentation, we employ an auxiliary classification head and introduce a consistency constraint between the CAM and segmentation results, i.e., Seg-CAM consistency. Additionally, uncertainty minimization is applied to the average prediction of the three decoders, which further regularizes predictions on unlabeled images and encourages inter-decoder consistency. Our proposed CDMA+ was compared with eight state-of-the-art SSL methods on two public pathological image datasets, and the experimental results showed that our method outperforms the other approaches under different annotation ratios. The code is available at https://github.com/HiLab-git/CDMA.
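Two of the losses named in the abstract (cross-decoder distillation from soft labels, and uncertainty minimization on the averaged prediction) can be written out compactly. The sketch below is an illustrative, framework-free approximation under assumed conventions, not the authors' implementation: the function names, the per-pixel logit layout `(H, W, C)`, and the temperature `T` are all assumptions, and in practice the teacher branches would be detached from the gradient.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax over the class axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def kl_div(p, q, eps=1e-8):
    # KL(p || q) per pixel, averaged over all pixels.
    return np.mean(np.sum(p * (np.log(p + eps) - np.log(q + eps)), axis=-1))

def cross_decoder_kd_loss(logits_list, T=2.0):
    # Each decoder (student) is pulled toward the temperature-softened
    # predictions of every other decoder (teacher) -- a sketch of CDKD.
    n, loss = len(logits_list), 0.0
    for i in range(n):
        student = softmax(logits_list[i] / T)
        for j in range(n):
            if i != j:
                teacher = softmax(logits_list[j] / T)
                loss += kl_div(teacher, student)
    return loss / (n * (n - 1))

def entropy_minimization_loss(logits_list, eps=1e-8):
    # Uncertainty minimization: penalize the entropy of the mean
    # prediction of the three decoders on unlabeled pixels.
    mean_prob = np.mean([softmax(l) for l in logits_list], axis=0)
    return -np.mean(np.sum(mean_prob * np.log(mean_prob + eps), axis=-1))
```

If all decoders agree exactly, the distillation term vanishes; the entropy term is bounded by `log C` for `C` classes and shrinks as the averaged prediction becomes confident, which is the regularization effect described above.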
