Lesions of early cancers often show flat, small, and isochromatic characteristics in medical endoscopy images, which are difficult to be captured. By analyzing the differences between the internal and external features of the lesion area, we propose a lesion-decoupling-based segmentation (LDS) network for assisting early cancer diagnosis. We introduce a plug-and-play module called self-sampling similar feature disentangling module (FDM) to obtain accurate lesion boundaries. Then, we propose a feature separation loss (FSL) function to separate pathological features from normal ones. Moreover, since physicians make diagnoses with multimodal data, we propose a multimodal cooperative segmentation network with two different modal images as input: white-light images (WLIs) and narrowband images (NBIs). Our FDM and FSL show a good performance for both single-modal and multimodal segmentations. Extensive experiments on five backbones prove that our FDM and FSL can be easily applied to different backbones for a significant lesion segmentation accuracy improvement, and the maximum increase of mean Intersection over Union (mIoU) is 4.58. For colonoscopy, we can achieve up to mIoU of 91.49 on our Dataset A and 84.41 on the three public datasets. For esophagoscopy, mIoU of 64.32 is best achieved on the WLI dataset and 66.31 on the NBI dataset.
Read full abstract