Abstract
Accurate segmentation of the optic disk (OD) and optic cup (OC) regions of the optic nerve head is a critical step in glaucoma diagnosis. Existing architectures based on convolutional neural networks (CNNs) still capture too little global information and generalize poorly to small-sample datasets. Advanced transformer-based models, although capable of capturing global image features, perform poorly in medical image segmentation because of their large parameter counts and insufficient local spatial information. To address these two problems, we propose a W-shaped hybrid network framework, CC-TransXNet, which combines the advantages of CNNs and transformers. First, by employing TransXNet and an improved ResNet as feature extraction modules, the network considers both local and global features to enhance its generalization ability. Second, the convolutional block attention module (CBAM) is introduced into the residual structure to improve recognition of the OD and OC by applying attention in both the channel and spatial dimensions. Third, the Contextual Attention (CoT) self-attention mechanism is used in the skip connection to adaptively allocate attention to contextual information, further enhancing segmentation accuracy. We conducted experiments on four publicly available datasets (REFUGE 2, RIM-ONE DL, GAMMA, and Drishti-GS). Compared with the traditional U-Net and with CNN- and transformer-based networks, the proposed CC-TransXNet improves segmentation accuracy and significantly enhances generalization on small datasets. Moreover, CC-TransXNet keeps the number of model parameters under control through its optimized design, which helps avoid overfitting and shows its potential for efficient segmentation.
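To make the channel-and-spatial attention step concrete, the following is a minimal sketch of a CBAM-style refinement with a residual connection, written in plain NumPy rather than a deep learning framework. It is an illustration under stated assumptions, not the CC-TransXNet implementation: the function names (`channel_attention`, `spatial_attention`, `cbam_residual`), the weight shapes, and the 7x7 kernel size in the usage lines are hypothetical, and the convolutions that would normally precede the attention inside a residual block are omitted.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def channel_attention(x, w1, w2):
    """Rescale each channel of x (shape C, H, W).

    w1 (C//r, C) and w2 (C, C//r) are the weights of a shared two-layer
    MLP; the reduction ratio r is an assumption of this sketch.
    """
    avg = x.mean(axis=(1, 2))  # global average pooling -> (C,)
    mx = x.max(axis=(1, 2))    # global max pooling     -> (C,)

    def mlp(v):
        return w2 @ np.maximum(w1 @ v, 0.0)  # linear -> ReLU -> linear

    scale = sigmoid(mlp(avg) + mlp(mx))  # per-channel weights in (0, 1)
    return x * scale[:, None, None]


def spatial_attention(x, kernel):
    """Rescale each spatial position of x (shape C, H, W).

    kernel has shape (2, k, k) with k odd; it is applied to the stacked
    channel-wise average and max maps with zero padding.
    """
    pooled = np.stack([x.mean(axis=0), x.max(axis=0)])  # (2, H, W)
    k = kernel.shape[-1]
    pad = k // 2
    padded = np.pad(pooled, ((0, 0), (pad, pad), (pad, pad)))
    h, w = x.shape[1:]
    logits = np.empty((h, w))
    for i in range(h):
        for j in range(w):
            logits[i, j] = np.sum(padded[:, i:i + k, j:j + k] * kernel)
    return x * sigmoid(logits)[None, :, :]


def cbam_residual(x, w1, w2, kernel):
    """Channel attention, then spatial attention, added back onto x."""
    refined = spatial_attention(channel_attention(x, w1, w2), kernel)
    return x + refined


# Usage with random (untrained) weights, purely to show the shapes involved.
rng = np.random.default_rng(0)
features = rng.standard_normal((8, 16, 16))  # C=8, H=W=16
out = cbam_residual(
    features,
    rng.standard_normal((2, 8)),     # w1, reduction ratio r=4
    rng.standard_normal((8, 2)),     # w2
    rng.standard_normal((2, 7, 7)),  # spatial kernel
)
```

The output has the same shape as the input, so a block of this kind can be dropped into a residual path without changing the surrounding feature dimensions. In a trained network the MLP and kernel weights would be learned rather than sampled.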