Abstract

Accurately segmenting object structures in computed tomography (CT) images is crucial for computer-aided surgery, diagnosis, and other interdisciplinary applications. Most state-of-the-art CT segmentation methods employ a classical skip-connection structure to integrate shallow and deep feature information. However, feature disparity between layers and information loss during transfer still limit the effectiveness of current CT image segmentation and its related subtasks. To alleviate this problem, we propose a novel three-dimensional multicategory segmentation model for CT imaging, the weakening feature disparity transformer (FDTR), built on the transformer architecture. First, to effectively capture global features, we design a vision transformer-based encoder. Second, to enhance the model's information representation in a lightweight manner, we devise a nested structure of dense compression connections. Last, to mitigate the semantic-feature disparity at shallow layers, we introduce supervised semantic signals. The vision transformer-based encoder and the dense connection structure together enrich the detailed information of deep features, while the semantic supervisory signal injects deep semantic information into the shallow features, strengthening the overall feature representation. We conducted extensive experiments on the KIPA2022 dataset for multiorgan segmentation and the YC2022 dataset for volumetric trait segmentation in broiler breeding. FDTR effectively enhances feature transfer between shallow and deep features, achieving the best results among the compared advanced models and demonstrating broad prospects for application.
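The core ideas sketched in the abstract — fusing upsampled deep features with shallow features through dense connections, and attaching a supervisory signal to the shallow branch — can be illustrated with a minimal NumPy sketch. This is not the authors' FDTR implementation; the feature shapes, projection weights, and auxiliary head below are all hypothetical, chosen only to show the fusion pattern.

```python
import numpy as np

def upsample2x(x):
    # Nearest-neighbour upsampling along the two spatial axes of a (C, H, W) array.
    return x.repeat(2, axis=1).repeat(2, axis=2)

def conv1x1(x, w):
    # 1x1 "convolution": a per-pixel linear mix of channels.
    # x: (C_in, H, W), w: (C_out, C_in) -> (C_out, H, W)
    return np.einsum('oc,chw->ohw', w, x)

rng = np.random.default_rng(0)

# Hypothetical encoder features at two scales (channels, height, width).
shallow = rng.standard_normal((8, 32, 32))   # fine detail, weak semantics
deep    = rng.standard_normal((16, 16, 16))  # coarse, semantically rich

# Dense-connection fusion: upsample the deep feature, project its channels
# down to match the shallow feature, and add the two, so spatial detail and
# deep semantics are combined in one map.
w_proj = rng.standard_normal((8, 16)) * 0.1
fused = shallow + conv1x1(upsample2x(deep), w_proj)

# Supervised semantic signal on the shallow branch: a hypothetical auxiliary
# head maps the fused shallow feature to per-pixel class logits, to which a
# segmentation loss would be applied during training.
num_classes = 3  # illustrative
w_head = rng.standard_normal((num_classes, 8)) * 0.1
aux_logits = conv1x1(fused, w_head)

print(fused.shape)       # (8, 32, 32)
print(aux_logits.shape)  # (3, 32, 32)
```

In a full model this fusion would be repeated at every decoder level in a nested fashion, and the auxiliary losses would be summed with the main segmentation loss.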
