Abstract

U-net, with its simple yet powerful encoder–decoder structure, dominates the field of medical image segmentation. However, convolution operations are limited by their receptive fields and cannot model long-range dependencies. The Transformer, by contrast, can model long-range dependencies through its core self-attention mechanism and has been widely applied to medical image segmentation. Nevertheless, both CNNs and Transformers compute correlations only within a single sample and ignore the correlations between different samples. To address this limitation, we propose a new Transformer module, the Dual-Attention Transformer (DAT), which captures correlations within a single sample while also learning correlations between different samples. In addition, U-net and some of its variants suffer from inadequate feature fusion, so we also improve the skip connections to strengthen the association between feature maps at different scales, reduce the semantic gap between the encoder and decoder, and further improve segmentation performance. We refer to the resulting network as DATUnet. Extensive experiments on the Synapse and ACDC datasets validate the superior performance of our network: it achieves an average DSC of 83.6% and 90.9% and an average HD95 of 13.99 and 1.466 on the Synapse and ACDC datasets, respectively.
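To make the distinction between within-sample and between-sample correlations concrete, the following is a minimal sketch in plain Python. It is not the DAT module itself: the abstract does not say how either attention is computed or how the two are combined. The sketch assumes a batch shaped [B][N][C] (samples, tokens, channels), omits all learned projections (queries, keys, and values are the raw tokens), and simply sums the two outputs. The names `attend` and `dual_attention` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(tokens):
    """Scaled dot-product self-attention over a list of vectors.

    Assumption: no learned projections, so Q = K = V = tokens.
    """
    dim = len(tokens[0])
    out = []
    for query in tokens:
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in tokens
        ]
        weights = softmax(scores)
        out.append(
            [sum(w * value[j] for w, value in zip(weights, tokens))
             for j in range(dim)]
        )
    return out


def dual_attention(batch):
    """Hypothetical combination of intra- and inter-sample attention.

    batch: nested list shaped [B][N][C].
    """
    num_samples, num_tokens = len(batch), len(batch[0])

    # Within-sample: each sample's tokens attend to one another.
    intra = [attend(sample) for sample in batch]

    # Between-sample: for each token position, the tokens at that
    # position attend across the samples in the batch.
    inter = [[None] * num_tokens for _ in range(num_samples)]
    for n in range(num_tokens):
        mixed = attend([batch[b][n] for b in range(num_samples)])
        for b in range(num_samples):
            inter[b][n] = mixed[b]

    # Assumption: the two branches are fused by element-wise addition.
    return [
        [[x + y for x, y in zip(intra[b][n], inter[b][n])]
         for n in range(num_tokens)]
        for b in range(num_samples)
    ]
```

Calling `dual_attention(batch)` returns a nested list with the same [B][N][C] shape as its input. In this simplified form the between-sample branch depends on which samples share a batch, which is one reason a real implementation would need to define its behavior at inference time.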
