Abstract

Medical image segmentation remains particularly challenging for complex and low‐contrast anatomical structures, especially glioma segmentation in brain MRI. Gliomas exhibit extensive heterogeneity in appearance and location on brain MR images, making robust tumour segmentation extremely difficult and leading to high variability even in manual segmentation. U‐Net has become the de facto standard in medical image segmentation tasks, with great success. Previous studies have proposed various U‐Net‐based 2D Convolutional Neural Networks (2D‐CNNs) and their 3D variants, known as 3D‐CNN‐based architectures, for capturing contextual information. However, U‐Net often struggles to explicitly model long‐range dependencies due to the inherent locality of convolution operations. Inspired by the recent success of natural language processing transformers in long‐range sequence learning, a multi‐view 2D U‐Nets with transformer (TransMVU) method is proposed, which combines the advantages of the transformer and the 2D U‐Net. On the one hand, the transformer encodes tokenized image patches from the CNN feature map into an input sequence, extracting global context for global feature modelling. On the other hand, multi‐view 2D U‐Nets can provide accurate segmentation with fewer parameters than 3D networks. Experimental results on the BraTS20 dataset demonstrate that our model outperforms state‐of‐the‐art 2D models and classic 3D models.
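To make the core mechanism concrete, the sketch below illustrates the general pattern the abstract describes: a CNN stem produces a feature map whose spatial positions are flattened into a token sequence and passed through a transformer encoder for global context modelling. This is a minimal, hypothetical PyTorch illustration, not the authors' TransMVU implementation; the module names (`CNNFeatureTokenizer`, `GlobalContextEncoder`) and all layer sizes are assumptions chosen for clarity.

```python
# Minimal sketch (assumed, not the authors' code) of tokenizing a CNN feature
# map and modelling global context with a transformer encoder.
import torch
import torch.nn as nn

class CNNFeatureTokenizer(nn.Module):
    """Toy CNN stem, then flatten the feature map into a token sequence."""
    def __init__(self, in_channels: int = 1, embed_dim: int = 128):
        super().__init__()
        self.stem = nn.Sequential(
            nn.Conv2d(in_channels, 64, kernel_size=3, stride=2, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(64, embed_dim, kernel_size=3, stride=2, padding=1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        feat = self.stem(x)                       # (B, C, H/4, W/4)
        tokens = feat.flatten(2).transpose(1, 2)  # (B, H*W/16, C) sequence
        return tokens

class GlobalContextEncoder(nn.Module):
    """Transformer encoder over the tokenized CNN feature map."""
    def __init__(self, embed_dim: int = 128, num_heads: int = 4, depth: int = 2):
        super().__init__()
        layer = nn.TransformerEncoderLayer(
            d_model=embed_dim, nhead=num_heads, batch_first=True
        )
        self.encoder = nn.TransformerEncoder(layer, num_layers=depth)

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        return self.encoder(tokens)               # same shape, globally mixed

if __name__ == "__main__":
    # One 2D MRI slice; in a multi-view setup, axial, coronal, and sagittal
    # slices would each be handled by their own 2D network.
    slice_2d = torch.randn(1, 1, 128, 128)
    tokens = CNNFeatureTokenizer()(slice_2d)
    context = GlobalContextEncoder()(tokens)
    print(tokens.shape, context.shape)  # both torch.Size([1, 1024, 128])
```

Because self-attention mixes every token with every other token, the encoded sequence carries image-wide context that plain convolutions, with their local receptive fields, cannot capture directly; this is the limitation of U-Net that the abstract highlights.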
