Abstract

Semantic segmentation of tumours plays a fundamental role in medical image analysis and has a significant impact on cancer diagnosis and treatment planning. UNet and its variants have achieved state-of-the-art results on a range of 2D and 3D medical image segmentation tasks across different imaging modalities. Recently, researchers have sought to merge the multi-head self-attention mechanism introduced by the Transformer into U-shaped network structures to enhance segmentation performance. However, both components have limitations that cause networks to under-perform on voxel-level classification tasks: the Transformer is unable to encode positional information and translation equivariance, while the Convolutional Neural Network (CNN) lacks global features and dynamic attention. In this work, a new architecture named TCTNet (Tumour Segmentation with 3D Direction-Wise Convolution and Transformer) is introduced, comprising an encoder built on a hybrid Transformer-CNN structure and a decoder that incorporates 3D Direction-Wise Convolution. Experimental results show that the proposed hybrid Transformer-CNN network obtains better performance than other 3D segmentation networks on the Brain Tumour Segmentation 2021 (BraTS21) dataset. Two further tumour datasets from the Medical Segmentation Decathlon are also used to test the generalisation ability of the proposed architecture. In addition, an ablation study was conducted to verify the effectiveness of the designed decoder for the tumour segmentation tasks. The proposed method maintains competitive segmentation performance while reducing computational effort by 10% in terms of floating-point operations.
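The abstract does not specify how the 3D Direction-Wise Convolution in the decoder is implemented. A common way to realise such an operation, and a plausible reading of the name, is to factorise a dense k x k x k convolution into three 1D convolutions applied along the depth, height, and width axes, which is also consistent with the reported reduction in floating-point operations. The sketch below illustrates that idea; the class name `DirectionWiseConv3d`, the kernel size, and the normalisation/activation choices are assumptions for illustration, not the paper's exact design.

```python
# Minimal sketch of a direction-wise 3D convolution block (PyTorch),
# assuming a k*k*k convolution is factorised into three 1D convolutions
# along the depth, height, and width directions. All design details here
# (layer order, InstanceNorm, LeakyReLU) are illustrative assumptions.
import torch
import torch.nn as nn


class DirectionWiseConv3d(nn.Module):
    """Factorised 3D convolution: one 1D convolution per spatial direction."""

    def __init__(self, in_channels: int, out_channels: int, kernel_size: int = 3):
        super().__init__()
        p = kernel_size // 2
        # Convolve along depth (D), then height (H), then width (W).
        self.conv_d = nn.Conv3d(in_channels, out_channels,
                                kernel_size=(kernel_size, 1, 1), padding=(p, 0, 0))
        self.conv_h = nn.Conv3d(out_channels, out_channels,
                                kernel_size=(1, kernel_size, 1), padding=(0, p, 0))
        self.conv_w = nn.Conv3d(out_channels, out_channels,
                                kernel_size=(1, 1, kernel_size), padding=(0, 0, p))
        self.norm = nn.InstanceNorm3d(out_channels)
        self.act = nn.LeakyReLU(inplace=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Input and output shape: (batch, channels, D, H, W)
        x = self.conv_d(x)
        x = self.conv_h(x)
        x = self.conv_w(x)
        return self.act(self.norm(x))


if __name__ == "__main__":
    block = DirectionWiseConv3d(in_channels=32, out_channels=64)
    volume = torch.randn(1, 32, 16, 64, 64)  # e.g. a cropped 3D feature map
    print(block(volume).shape)               # torch.Size([1, 64, 16, 64, 64])
```

Compared with a dense 3x3x3 convolution, this factorisation scales roughly with 3k rather than k^3 multiply-accumulate operations per output channel, which is the kind of saving the abstract's 10% FLOP reduction hints at when applied only in the decoder.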
