Abstract

Background and Objective: Accurate segmentation of the esophageal gross tumor volume (GTV) indirectly enhances the efficacy of radiotherapy for patients with esophageal cancer. Learning-based methods have been employed to fuse cross-modality positron emission tomography (PET) and computed tomography (CT) images to improve segmentation accuracy. This fusion is essential because it combines functional metabolic information from PET with anatomical information from CT. Although existing three-dimensional (3D) segmentation methods have achieved state-of-the-art (SOTA) performance, they typically rely on pure-convolution architectures, which limits their ability to capture long-range spatial dependencies because convolution is confined to a local receptive field. To address this limitation and further improve esophageal GTV segmentation, this work proposes a transformer-guided cross-modality adaptive feature fusion network for PET/CT scans, referred to as TransAttPSNN.

Methods: Specifically, we establish an attention progressive semantically-nested network (AttPSNN) by incorporating a convolutional attention mechanism into the progressive semantically-nested network (PSNN). We then devise a plug-and-play transformer-guided cross-modality adaptive feature fusion module, which is inserted between the corresponding multi-scale features of a two-stream AttPSNN backbone (one stream for the PET modality and one for the CT modality), yielding the proposed TransAttPSNN architecture.

Results: In extensive four-fold cross-validation experiments on a clinical PET/CT cohort, the proposed approach achieves a Dice similarity coefficient (DSC) of 0.76 ± 0.13, a Hausdorff distance (HD) of 9.38 ± 8.76 mm, and a mean surface distance (MSD) of 1.13 ± 0.94 mm, outperforming the SOTA competing methods. Qualitatively, the predicted segmentations are consistent with the lesion areas.

Conclusions: The devised transformer-guided cross-modality adaptive feature fusion module integrates the strengths of PET and CT, effectively enhancing the segmentation performance for the esophageal GTV. The proposed TransAttPSNN further advances research on esophageal GTV segmentation.
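
For readers unfamiliar with the reported metrics, the following is a minimal sketch of how a DSC and an HD can be computed between a predicted and a reference mask. It is an illustration under stated assumptions, not the evaluation code used in this work: masks are represented as sets of voxel index tuples, the HD is computed by brute force over all foreground voxels rather than over extracted surfaces, and the function names and the `spacing` parameter are hypothetical. The abstract does not state whether the reported HD is the maximum or a percentile variant, so the plain maximum is assumed here.

```python
import math


def dice(pred, truth):
    """Dice similarity coefficient between two sets of voxel coordinates."""
    if not pred and not truth:
        # Assumed convention: two empty masks agree perfectly.
        return 1.0
    return 2.0 * len(pred & truth) / (len(pred) + len(truth))


def hausdorff(pred, truth, spacing=(1.0, 1.0, 1.0)):
    """Symmetric Hausdorff distance in mm (brute force, all voxel pairs).

    `spacing` is the assumed voxel size in mm along each axis.
    """
    if not pred or not truth:
        raise ValueError("Hausdorff distance is undefined for an empty mask")

    def to_mm(point):
        # Convert voxel indices to physical coordinates.
        return tuple(i * s for i, s in zip(point, spacing))

    a = [to_mm(p) for p in pred]
    b = [to_mm(p) for p in truth]

    def directed(src, dst):
        # Largest distance from any point in src to its nearest point in dst.
        return max(min(math.dist(p, q) for q in dst) for p in src)

    return max(directed(a, b), directed(b, a))


# Toy example: two 3-voxel masks that share two voxels.
pred = {(0, 0, 0), (0, 0, 1), (0, 1, 0)}
truth = {(0, 0, 0), (0, 0, 1), (0, 0, 2)}

print(round(dice(pred, truth), 3))  # 0.667
print(hausdorff(pred, truth))       # 1.0
```

The brute-force distance is quadratic in the number of voxels, so this sketch is only practical for small masks. A real evaluation would typically restrict the computation to surface voxels and use a distance transform.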
