Accurate medical image segmentation plays a vital role in clinical practice. Convolutional neural networks (CNNs) and Transformers are the mainstream architectures for this task. However, CNNs lack the ability to model global dependencies, while Transformers struggle to extract local details. In this paper, we propose DATTNet (Dual ATTention Network), an encoder-decoder deep learning model for medical image segmentation. DATTNet is built in a hierarchical fashion with two novel components: (1) a Dual Attention module, designed to model global dependencies along the spatial and channel dimensions, and (2) a Context Fusion Bridge, which remixes feature maps at multiple scales and models their correlations. We evaluate DATTNet on the ACDC, Synapse, and Kvasir-SEG datasets. The proposed model shows superior performance, effectiveness, and robustness compared with state-of-the-art (SOTA) methods, with mean Dice Similarity Coefficient scores of 92.2%, 84.5%, and 89.1% on cardiac, abdominal organ, and gastrointestinal polyp segmentation, respectively. The quantitative and qualitative results demonstrate that DATTNet performs well across different modalities (MRI, CT, and endoscopy) and generalizes to various tasks. It therefore shows potential for practical clinical applications. The code has been released at https://github.com/MhZhang123/DATTNet/tree/main.
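For readers unfamiliar with the reported metric, the following is a minimal sketch of how a mean Dice Similarity Coefficient can be computed over flattened label maps. It is an illustration only, not the evaluation code from the DATTNet repository: the function names `dice_coefficient` and `mean_dice`, the toy label maps, and the convention of scoring 1.0 for a class absent from both maps are all assumptions made here.

```python
from typing import Sequence


def dice_coefficient(pred: Sequence[int], target: Sequence[int], label: int) -> float:
    """Dice Similarity Coefficient for one class: 2|P ∩ T| / (|P| + |T|)."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    pred_count = sum(1 for p in pred if p == label)
    target_count = sum(1 for t in target if t == label)
    overlap = sum(1 for p, t in zip(pred, target) if p == label and t == label)
    denom = pred_count + target_count
    # Assumed convention: a class absent from both maps counts as a perfect match.
    return 1.0 if denom == 0 else 2.0 * overlap / denom


def mean_dice(pred: Sequence[int], target: Sequence[int], labels: Sequence[int]) -> float:
    """Average the per-class Dice scores over the given foreground labels."""
    scores = [dice_coefficient(pred, target, c) for c in labels]
    return sum(scores) / len(scores)


# Hypothetical 2x3 label maps, flattened row by row (0 = background).
pred = [0, 1, 1, 2, 2, 0]
target = [0, 1, 2, 2, 2, 0]
print(mean_dice(pred, target, labels=[1, 2]))
```

Background is excluded from the average in this sketch by passing only the foreground labels; whether the paper does the same for each dataset is not stated in the abstract.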