Abstract
Medical image segmentation is crucial for accurately locating lesion regions and assisting doctors in diagnosis. However, most existing methods fail to exploit local details and global semantic information together, and therefore struggle to capture fine-grained content such as small targets and irregular boundaries. To address this issue, we propose a novel Pyramid Fourier Deformable Network (PFD-Net) for medical image segmentation, which leverages the strengths of both CNNs and Transformers. PFD-Net first uses a PVTv2-based Transformer as the primary encoder to capture global information, and then enhances both local and global feature representations with the Fast Fourier Convolution Residual (FFCR) module. Moreover, PFD-Net introduces the Dilated Deformable Refinement (DDR) module to strengthen the model’s capacity to comprehend the global semantic structures of shape-diverse targets and their irregular boundaries. Lastly, a Cross-Level Fusion Block with deformable convolution (CLFB) is proposed to combine the decoded feature maps from the final Residual Decoder Block (DDR) with local features from the CNN auxiliary encoder branch, improving the network’s ability to perceive targets that resemble the surrounding structures. Extensive experiments were conducted on nine publicly available medical image datasets covering five types of segmentation tasks: polyp, abdominal, cardiac, gland cell, and nuclei segmentation. The qualitative and quantitative results demonstrate that PFD-Net outperforms existing state-of-the-art methods across various evaluation metrics. On the most challenging dataset (ETIS), it achieves the highest mDice of 0.826, a 1.8% improvement over the previous best-performing HSNet and a 3.6% improvement over the next-best PVT-CASCADE. Code is available at https://github.com/ChaorongYang/PFD-Net.
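For readers unfamiliar with the mDice figure quoted above, it is commonly computed as the Dice coefficient of each predicted mask against its ground truth, averaged over the test images. The following is a minimal sketch under that assumption, using NumPy and binary masks. The function names `dice_score` and `mean_dice` and the convention for two empty masks are illustrative choices and are not taken from the PFD-Net code.

```python
import numpy as np


def dice_score(pred, target):
    """Dice coefficient between two binary masks of the same shape."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    intersection = np.logical_and(pred, target).sum()
    total = pred.sum() + target.sum()
    # Assumed convention: two empty masks count as a perfect match.
    if total == 0:
        return 1.0
    return float(2.0 * intersection / total)


def mean_dice(preds, targets):
    """Average the per-image Dice coefficient over paired masks."""
    scores = [dice_score(p, t) for p, t in zip(preds, targets)]
    return float(np.mean(scores))
```

Evaluation protocols differ in how they binarize predictions and handle empty masks, so values from this sketch may not match a given benchmark's script exactly.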