Abstract

Medical image segmentation is essential to modern healthcare systems, particularly for disease diagnosis and treatment planning. U-Net has become the de facto standard for a wide range of medical image segmentation tasks, with great success. However, due to the intrinsic locality of convolution operations, U-Net is limited in explicitly modeling long-range dependencies. The Transformer, designed for sequence-to-sequence prediction, has emerged as an alternative architecture with an innate global self-attention mechanism, but it can suffer from limited localization ability owing to insufficient low-level detail. In this paper, we propose Trans-U, which combines the Transformer and U-Net, as a strong alternative for medical image segmentation. The Transformer encodes tokenized image patches from a convolutional neural network (CNN) feature map as input sequences for extracting global context. The decoder then upsamples the encoded features and combines them with high-resolution CNN feature maps to enable precise localization. We argue that Transformers can serve as strong encoders for medical image segmentation, especially when paired with U-Net to restore localized spatial information and recover finer details. Trans-U outperforms various competing methods across several medical applications, including multi-organ segmentation and cardiac segmentation.
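The data flow described above (tokenize CNN features, mix them globally with self-attention, then upsample and fuse with high-resolution skip features) can be sketched in a minimal numpy example. All shapes, the random weights, and the single-head attention are illustrative assumptions, not the paper's actual configuration:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(tokens, rng):
    """Single-head self-attention over tokens of shape (N, C).

    Weights are random placeholders; in the real model they are learned.
    """
    n, c = tokens.shape
    Wq, Wk, Wv = (rng.standard_normal((c, c)) / np.sqrt(c) for _ in range(3))
    q, k, v = tokens @ Wq, tokens @ Wk, tokens @ Wv
    attn = softmax(q @ k.T / np.sqrt(c), axis=-1)  # every token attends to every other: global context
    return attn @ v

def nearest_upsample(x, factor):
    """Nearest-neighbor upsampling of a (C, H, W) map."""
    return x.repeat(factor, axis=1).repeat(factor, axis=2)

rng = np.random.default_rng(0)
low_res = rng.standard_normal((8, 4, 4))   # deep CNN feature map (C=8, 4x4), hypothetical sizes
high_res = rng.standard_normal((8, 8, 8))  # earlier, higher-resolution CNN features (skip path)

# 1. Tokenize: each spatial position of the feature map becomes one token.
tokens = low_res.reshape(8, -1).T          # (16 tokens, 8 channels)
# 2. Extract global context with self-attention.
mixed = self_attention(tokens, rng)        # (16, 8)
# 3. Reshape back to a spatial map, upsample, and fuse with the high-res skip features.
encoded = mixed.T.reshape(8, 4, 4)
fused = np.concatenate([nearest_upsample(encoded, 2), high_res], axis=0)
print(fused.shape)  # (16, 8, 8)
```

The concatenation in step 3 is what lets the decoder recover fine spatial detail that the tokenized, low-resolution path has lost.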
