SwinBTS: A Method for 3D Multimodal Brain Tumor Segmentation Using Swin Transformer.

Yun Jiang, Yuan Zhang, Tongtong Cheng, Jinkun Dong, Jing Liang, Xin Lin

doi:10.3390/brainsci12060797

Open Access

Journal: Brain Sciences
Publication Date: Jun 17, 2022
Citations: 64
License type: CC BY 4.0

Affiliation: Northwest Normal University

Abstract

Brain tumor semantic segmentation is a critical medical image processing task that helps clinicians diagnose patients and determine the extent of lesions. In recent years, convolutional neural networks (CNNs) have demonstrated exceptional performance in computer vision tasks. For 3D medical imaging tasks, deep CNNs built on an encoder–decoder structure with skip connections are frequently used. However, CNNs struggle to learn global and long-range semantic information. The transformer, by contrast, has recently succeeded in natural language processing and computer vision because its self-attention mechanism models global information. For demanding prediction tasks such as 3D medical image segmentation, both local and global features are critical. In this work, we propose SwinBTS, a new 3D medical image segmentation approach that combines a transformer, a convolutional neural network, and an encoder–decoder structure, and that formulates 3D brain tumor semantic segmentation as a sequence-to-sequence prediction problem. To extract contextual information, the 3D Swin Transformer serves as the network’s encoder and decoder, while convolutional operations perform upsampling and downsampling. Finally, the segmentation results are produced with an enhanced Transformer module that we designed to improve the extraction of detail features. Extensive experiments on the BraTS 2019, BraTS 2020, and BraTS 2021 datasets show that SwinBTS outperforms state-of-the-art 3D methods for brain tumor segmentation on 3D MRI scans.
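The sequence-to-sequence framing and the choice of a Swin Transformer can be illustrated with a small, self-contained sketch. It is not the authors' implementation. It only shows, under assumed sizes, how a 3D volume becomes a grid of patch tokens, how each token maps to a non-overlapping 3D attention window, and how much smaller the attention workload is when attention is restricted to windows. The function names (`token_grid`, `window_id`, `attention_pairs`) and the volume, patch, and window sizes are hypothetical choices for illustration and do not come from the paper. The shifted-window step of the Swin Transformer is omitted.

```python
from math import prod


def token_grid(volume_shape, patch_size):
    """Number of patch tokens along each axis of a 3D volume."""
    if any(s % p for s, p in zip(volume_shape, patch_size)):
        raise ValueError("volume shape must be divisible by patch size")
    return tuple(s // p for s, p in zip(volume_shape, patch_size))


def window_id(token_coord, grid, window_size):
    """Flat index of the non-overlapping 3D window that contains a token.

    Assumes the token grid is divisible by the window size on every axis.
    """
    windows_per_axis = tuple(g // w for g, w in zip(grid, window_size))
    d, h, w = (c // ws for c, ws in zip(token_coord, window_size))
    return (d * windows_per_axis[1] + h) * windows_per_axis[2] + w


def attention_pairs(grid, window_size=None):
    """Query-key pairs scored by self-attention over the token grid."""
    n_tokens = prod(grid)
    if window_size is None:
        # Global attention: every token attends to every token.
        return n_tokens ** 2
    per_window = prod(window_size)      # tokens inside one window
    n_windows = n_tokens // per_window  # windows tiling the grid
    return n_windows * per_window ** 2


# Hypothetical sizes, chosen for illustration only (not taken from the paper).
grid = token_grid((128, 128, 128), (4, 4, 4))    # 32 x 32 x 32 tokens
global_pairs = attention_pairs(grid)             # attention over all tokens
window_pairs = attention_pairs(grid, (8, 8, 8))  # attention inside 8^3 windows
```

With these assumed sizes the volume yields 32 × 32 × 32 = 32,768 tokens, and restricting attention to 8 × 8 × 8 windows scores 64 times fewer query-key pairs than global attention. That reduction is the general motivation for window-based attention on large 3D inputs.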

Full Text