The UNet family of networks has led the field of medical image segmentation since its introduction. However, the encoder and decoder structures of traditional UNet-style networks are complex, with large numbers of parameters and floating-point operations. Training such models requires large amounts of data, but most medical datasets contain only a limited number of samples. To address this issue, we propose the global frequency domain UNet (GFUNet), a novel architecture for fast medical image segmentation. Inspired by recent modified Multi-Layer Perceptron (MLP)-like models, we combine the Fourier Transform with the UNet structure to achieve more efficient and effective encoding and decoding. In addition, a dual-domain encoding module is designed to improve the performance of the encoder and decoder by making full use of frequency-domain features. Furthermore, owing to the favorable properties of the Fourier Transform and its optimization, our network greatly reduces the number of parameters compared to other UNets. We evaluate GFUNet on several medical segmentation tasks, achieving improved segmentation performance compared to state-of-the-art network architectures for medical image segmentation. Compared to the original UNet, GFUNet reduces the number of parameters by a factor of 46 and the computational complexity by a factor of 114, while considerably improving the Dice score.
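
To make the idea of combining the Fourier Transform with feature encoding concrete, the following is a minimal sketch of a global frequency-domain filter in the spirit of MLP-like global filter models. It is not the GFUNet layer itself, whose details are not given here. The function name `global_frequency_filter`, the `(H, W, C)` layout, and the filter shape are assumptions made for illustration only.

```python
import numpy as np

def global_frequency_filter(x, weight):
    """Filter a real feature map in the frequency domain (illustrative sketch).

    x:      real array of shape (H, W, C)
    weight: complex array of shape (H, W // 2 + 1, C); in a trained model this
            would be a learnable parameter (assumption, not from the paper)
    """
    h, w, _ = x.shape
    # 2D real FFT over the two spatial axes; output shape is (H, W // 2 + 1, C).
    spectrum = np.fft.rfft2(x, axes=(0, 1), norm="ortho")
    # Element-wise product: every frequency bin depends on all spatial positions,
    # so this single multiplication mixes information globally.
    spectrum = spectrum * weight
    # Back to the spatial domain with the original height and width.
    return np.fft.irfft2(spectrum, s=(h, w), axes=(0, 1), norm="ortho")

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 8, 4))

# An all-ones filter leaves the feature map unchanged.
identity = np.ones((8, 8 // 2 + 1, 4), dtype=complex)
y = global_frequency_filter(x, identity)

# A filter that keeps only the zero-frequency bin yields the per-channel mean.
dc_only = np.zeros_like(identity)
dc_only[0, 0, :] = 1.0
y_mean = global_frequency_filter(x, dc_only)
```

In this sketch the filter holds one complex weight per frequency bin and channel, so its size grows with the spatial resolution rather than with the square of the channel count, which is one plausible source of the parameter savings described above.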