Abstract

Models that utilize self-attention mechanisms, including but not limited to Vision Transformers (ViTs), have shown promising performance in visual tasks like semantic segmentation. This is attributed to their capacity to capture global features of images, enabling them to learn more comprehensive representations. However, transformer-based models typically demand a considerable amount of training data to achieve satisfactory performance, and they are less able to efficiently extract local image features. As a result, these models may be less effective in computer vision tasks that involve small-scale datasets, such as medical image segmentation. To address these issues, this paper proposes a dual-stream encoding-based transformer, dubbed the Dual-stream Transformer (DS-Former). The dual-stream module in DS-Former simultaneously acquires local and global features of the image and constructs relations between the two kinds of features via self-attention. Compared with simple splicing or serial connection, the dual-stream module extracts more comprehensive and hierarchical feature information from the fused interaction of the two feature types. Our method is evaluated on the UK Biobank (UKBB) cardiac magnetic resonance imaging (CMR) dataset and the Beyond the Cranial Vault (BTCV) abdominal challenge dataset. The experimental results show that DS-Former outperforms other state-of-the-art approaches on both datasets, indicating its potential for medical image semantic segmentation.
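As a rough illustration of the cross-stream interaction described above, the sketch below fuses a set of hypothetical local tokens (e.g. from a convolutional branch) and global tokens (e.g. from a ViT branch) with single-head cross-attention in each direction. All names, shapes, and the single-head formulation are illustrative assumptions, not the paper's actual DS-Former architecture.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(q, k, v):
    # Scaled dot-product attention: each query attends over all keys.
    scores = q @ k.T / np.sqrt(q.shape[-1])
    return softmax(scores, axis=-1) @ v

def dual_stream_fuse(local_tok, global_tok):
    # Hypothetical fusion step: local tokens query the global stream
    # and global tokens query the local stream, so each stream is
    # enriched with information from the other.
    local_enriched = attention(local_tok, global_tok, global_tok)
    global_enriched = attention(global_tok, local_tok, local_tok)
    return local_enriched, global_enriched

rng = np.random.default_rng(0)
local_tok = rng.standard_normal((16, 32))   # 16 local tokens, dim 32
global_tok = rng.standard_normal((8, 32))   # 8 global tokens, dim 32
l_out, g_out = dual_stream_fuse(local_tok, global_tok)
```

Each output keeps its stream's token count and embedding dimension, so the fused streams can feed subsequent encoder stages unchanged; this bidirectional exchange is what distinguishes the interaction from simple concatenation or a serial local-then-global pipeline.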
