Abstract

The Transformer architecture has been widely applied to image segmentation because of its strong ability to capture long-range dependencies. However, it is relatively weak at capturing local features and requires large amounts of training data. Medical image segmentation, in contrast, places high demands on local features and typically involves small datasets, so existing Transformer networks suffer a significant drop in performance when applied directly to this task. To address these issues, we design a new medical image segmentation architecture called CT-Net. It effectively extracts local and global representations through an asymmetric asynchronous branch-parallel structure while reducing unnecessary computational cost. In addition, we propose a high-density information fusion strategy that efficiently fuses the features of the two branches using a fusion module with only 0.05M parameters. This strategy ensures high portability and makes it possible to apply transfer learning directly to alleviate dataset dependency. Finally, we design a parameter-adjustable multi-perceptive loss function for this architecture that optimizes training from both pixel-level and global perspectives. We evaluate the network on 5 different tasks across 9 datasets: compared with SwinUNet, CT-Net improves IoU by 7.3% and 1.8% on the Glas and MoNuSeg datasets, respectively, and improves the average DSC on the Synapse dataset by 3.5%.
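To illustrate the idea of a parameter-adjustable loss that combines a pixel-level term with a global, region-level term, the following is a minimal sketch. It assumes a cross-entropy term for the pixel-level perspective and a Dice term for the global perspective, weighted by an adjustable coefficient; the class name `MultiPerceptiveLoss` and the weighting parameter `lam` are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiPerceptiveLoss(nn.Module):
    """Sketch of a pixel-level + global loss with an adjustable trade-off.

    pixel-level term: per-pixel cross-entropy
    global term: soft Dice computed over whole feature maps
    (illustrative only; CT-Net's actual loss may differ)
    """

    def __init__(self, lam: float = 0.5, smooth: float = 1e-5):
        super().__init__()
        self.lam = lam        # adjustable weight between the two perspectives
        self.smooth = smooth  # numerical stabiliser for the Dice term

    def forward(self, logits: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
        # logits: (B, C, H, W); target: (B, H, W) with integer class labels
        pixel_loss = F.cross_entropy(logits, target)

        # Soft Dice over the full image acts as the "global" term.
        probs = F.softmax(logits, dim=1)
        one_hot = F.one_hot(target, num_classes=logits.shape[1])
        one_hot = one_hot.permute(0, 3, 1, 2).float()
        inter = (probs * one_hot).sum(dim=(2, 3))
        union = probs.sum(dim=(2, 3)) + one_hot.sum(dim=(2, 3))
        dice = (2.0 * inter + self.smooth) / (union + self.smooth)
        global_loss = 1.0 - dice.mean()

        return self.lam * pixel_loss + (1.0 - self.lam) * global_loss
```

In practice, such a loss would be instantiated once (e.g. `criterion = MultiPerceptiveLoss(lam=0.6)`) and applied to the network's segmentation logits during training, with `lam` tuned per dataset.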
