Objective. In recent years, convolutional neural networks, which typically focus on extracting spatial domain features, have shown limitations in learning global contextual information. However, frequency domain can offer a global perspective that spatial domain methods often struggle to capture. To address this limitation, we propose FreqSNet, which leverages both frequency and spatial features for medical image segmentation. Approach. To begin, we propose a frequency-space representation aggregation block (FSRAB) to replace conventional convolutions. FSRAB contains three frequency domain branches to capture global frequency information along different axial combinations, while a convolutional branch is designed to interact information across channels in local spatial features. Secondly, the multiplex expansion attention block extracts long-range dependency information using dilated convolutional blocks, while suppressing irrelevant information via attention mechanisms. Finally, the introduced Feature Integration Block enhances feature representation by integrating semantic features that fuse spatial and channel positional information. Main results. We validated our method on 5 public datasets, including BUSI, CVC-ClinicDB, CVC-ColonDB, ISIC-2018, and Luna16. On these datasets, our method achieved Intersection over Union (IoU) scores of 75.46%, 87.81%, 79.08%, 84.04%, and 96.99%, and Hausdorff distance values of 22.22 mm, 13.20 mm, 13.08 mm, 13.51 mm, and 5.22 mm, respectively. Compared to other state-of-the-art methods, our FreqSNet achieves better segmentation results. Significance. Our method can effectively combine frequency domain information with spatial domain features, enhancing the segmentation performance and generalization capability in medical image segmentation tasks.
Read full abstract