Abstract

Existing convolutional neural networks (CNNs) have achieved remarkable performance in medical image segmentation tasks. However, they still fail to generalize well to unseen datasets due to the limited size and diversity of training data as well as distribution shifts. Meanwhile, CNN-based methods have inherent limitations in capturing global context and suffer from semantic dilution in the decoder stage, which leads to suboptimal predictions, especially under low inter-class discrepancy and complex backgrounds. In this paper, we propose a novel framework named CCNet that learns complementary and contrastive features for accurate segmentation. Firstly, a novel complementary feature extraction module is formulated to learn global–local features by coordinating Transformer and CNN-style parallel branches. Secondly, a global context refinement module is constructed to adaptively generate a set of layer-specific global maps, so as to remedy semantic dilution. Thirdly, a mutual attentive module is designed to alleviate background confusion, in which contrastive cues are mutually captured from the foreground and background views by cascaded dual attention blocks. Moreover, we implement synthetic data augmentation to deal with training data scarcity and distribution shifts, thereby improving the out-of-distribution generalization of our model. Extensive experiments demonstrate that our CCNet achieves outstanding performance on polyp, skin lesion, and nuclei segmentation tasks, outperforming state-of-the-art methods.
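To make the idea of complementary global–local feature extraction concrete, the following is a minimal PyTorch-style sketch of a block with a convolutional branch for local detail and a self-attention branch for global context, fused by concatenation. The class name, fusion strategy, and hyperparameters here are illustrative assumptions, not the authors' exact module design.

import torch
import torch.nn as nn

class ComplementaryBlock(nn.Module):
    """Illustrative global-local block: CNN branch + self-attention branch."""
    def __init__(self, channels: int, num_heads: int = 4):
        super().__init__()
        # Local (CNN-style) branch captures fine-grained spatial detail
        self.local_branch = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
        )
        # Global (Transformer-style) branch: self-attention over flattened tokens
        self.norm = nn.LayerNorm(channels)
        self.attn = nn.MultiheadAttention(channels, num_heads, batch_first=True)
        # 1x1 convolution fuses the concatenated branches back to the input width
        self.fuse = nn.Conv2d(2 * channels, channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        local = self.local_branch(x)
        # Flatten spatial dimensions into tokens: (B, H*W, C)
        tokens = self.norm(x.flatten(2).transpose(1, 2))
        glob, _ = self.attn(tokens, tokens, tokens)
        glob = glob.transpose(1, 2).reshape(b, c, h, w)
        return self.fuse(torch.cat([local, glob], dim=1))

if __name__ == "__main__":
    block = ComplementaryBlock(channels=64)
    out = block(torch.randn(1, 64, 32, 32))
    print(out.shape)  # torch.Size([1, 64, 32, 32])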
