Abstract

In recent years, Transformers have achieved notable success in medical image segmentation owing to their outstanding ability to model long-range dependencies. However, many existing segmentation methods use the Transformer only as an auxiliary module for capturing global context, which limits its potential. Moreover, the self-attention mechanism within Transformers can suffer from attention collapse, which widens the semantic gap between the encoder and decoder. Furthermore, most networks struggle to handle multi-scale and multi-channel feature information effectively. To address these problems, we propose HCA-Former, a hybrid Convolutional Neural Network (CNN) and Transformer method for medical image segmentation. We design a local multi-channel attention block (LMCA) that effectively combines CNN and Transformer features, enabling multi-channel information extraction and interaction. A Double-Former Block (DFB) alleviates the semantic gap between the encoder and decoder, restoring more detailed information. In addition, a global multi-scale attention block (GMSA) establishes information interaction among multi-scale targets, thereby enhancing the generalization capability of the model. To validate the effectiveness of our approach, we evaluate the proposed method on three challenging tasks: the MICCAI 2015 Multi-Atlas Abdomen Labeling Challenge (Synapse), the Automated Cardiac Diagnosis Challenge (ACDC), and the Medical Segmentation Decathlon brain tumor task (MSD brain tumor). Extensive experiments demonstrate that HCA-Former achieves performance competitive with or better than state-of-the-art approaches for 3D medical image segmentation.
