Segmentation of surface defects on citrus fruit contributes significantly to quality assessment and post-harvest processing. However, semantic segmentation of citrus surface defects remains challenging owing to the complexity of defect features, unclear defect boundaries, and semantic gaps within models. Therefore, a superpixel-driven, multiscale channel cross-fusion transformer (MCCT)-based UNet, named SPMUNet, was proposed to segment citrus surface defects. The model first built a multiscale superpixel region feature (MSRF) branch using the simple linear iterative clustering (SLIC) algorithm, and the resulting key features were embedded into the downsampling layers. Then, the proposed Superpixel Loss augmented the loss function with a superpixel-grid term that provides error information for boundary segmentation. In addition, the MCCT module was used to improve the skip-connection layers, achieving feature fusion through collaborative learning. Finally, the performance of the model was validated through ablation studies, feature-map visualization, and comparison experiments. The results showed that the mPA, mIoU, and Dice coefficient of SPMUNet for citrus surface defect segmentation were 89.1%, 83.8%, and 88.6%, respectively. Compared with the baseline, the proposed MSRF branch, MCCT module, and Superpixel Loss improved mIoU by 2.6%, 1.3%, and 4.2%, respectively, and the effectiveness of these components was further confirmed by feature-map visualization. Furthermore, SPMUNet achieved the best segmentation performance among the compared semantic segmentation models (UNet, ResUNet++, DeepLabv3+, SwinUNet, and TransUNet). SPMUNet also excelled at segmenting small targets and boundaries, improving on the baseline by 10.2% and 7.0%, respectively. These gains demonstrate the effectiveness of the selected superpixel features. The proposed model achieves high-precision segmentation of citrus surface defects and provides a technical reference for citrus sorting and quality assessment.
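The abstract gives no implementation details for the MSRF branch, but its core step, computing region-level features from SLIC superpixels at several scales, can be sketched roughly as follows. This is a minimal illustration using scikit-image's `slic`; the mean-pooling scheme, the segment counts, and the helper names are assumptions for illustration, not the authors' code.

```python
import numpy as np
import torch
from skimage.segmentation import slic

def superpixel_region_features(image, n_segments=200, compactness=10.0):
    """Mean-pool pixel values within each SLIC superpixel, then scatter
    the pooled value back to every pixel of that region.
    `image` is an HxWx3 float array in [0, 1]. (Hypothetical helper.)"""
    labels = slic(image, n_segments=n_segments,
                  compactness=compactness, start_label=0)
    pooled = image.copy()
    for lab in np.unique(labels):
        mask = labels == lab
        pooled[mask] = image[mask].mean(axis=0)  # region mean colour
    return labels, pooled

def multiscale_superpixel_features(image, scales=(100, 200, 400)):
    """Assumed multiscale variant: run SLIC at several segment counts
    and stack the region-pooled maps as extra input channels."""
    maps = [superpixel_region_features(image, n)[1] for n in scales]
    feats = np.concatenate(maps, axis=-1)        # HxWx(3*len(scales))
    return torch.from_numpy(feats).permute(2, 0, 1).float()
```

A tensor produced this way could plausibly be concatenated with (or projected into) the encoder's downsampling features, which is one simple way to realize the "embedded in the downsampling layers" described above.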
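Likewise, the Superpixel Loss is described only as being built on the superpixel grid to supply boundary error information. One plausible reading, sketched below, penalizes disagreement between each pixel's predicted class probabilities and the mean prediction over its superpixel, so the penalty concentrates where predicted boundaries cut through superpixel regions. The region-mean pooling and the MSE formulation are assumptions, not the published loss.

```python
import torch
import torch.nn.functional as F

def superpixel_loss(logits, labels_sp, num_sp):
    """Hypothetical superpixel-consistency loss: pixels inside one SLIC
    region should share a prediction, so deviation from the region-mean
    probability is penalized.
    logits:    (B, C, H, W) raw network outputs
    labels_sp: (B, H, W) long tensor of superpixel ids in [0, num_sp)
    """
    probs = logits.softmax(dim=1)                      # (B, C, H, W)
    B, C, H, W = probs.shape
    flat = probs.reshape(B, C, H * W)                  # per-pixel probs
    ids = labels_sp.reshape(B, 1, H * W).expand(B, C, -1)
    # Mean predicted probability per superpixel region.
    sums = torch.zeros(B, C, num_sp, device=probs.device)
    sums.scatter_add_(2, ids, flat)
    counts = torch.zeros(B, C, num_sp, device=probs.device)
    counts.scatter_add_(2, ids, torch.ones_like(flat))
    means = sums / counts.clamp_min(1.0)
    # Gather each pixel's region mean and penalize deviation from it.
    region_mean = means.gather(2, ids)                 # (B, C, H*W)
    return F.mse_loss(flat, region_mean)
```

In practice such a term would presumably be added to a standard segmentation loss (e.g., cross-entropy) with a weighting coefficient; the abstract does not state the weighting.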
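Finally, the MCCT skip connections are characterized only as multiscale channel cross-fusion with collaborative learning. A toy single-head channel cross-attention, in which the channels of one skip feature attend over channel tokens gathered from several encoder scales, conveys the general idea; the common token grid, single head, and residual fusion below are illustrative choices, not the published module.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ChannelCrossFusion(nn.Module):
    """Toy single-head channel cross-attention for a skip connection:
    channels of the skip feature (queries) attend over channel tokens
    pooled from several encoder scales (keys/values). All sizes here
    are illustrative assumptions."""
    def __init__(self, token_hw=16):
        super().__init__()
        self.token_hw = token_hw

    def tokens(self, x):
        # Resize to a common grid, then flatten space -> channel tokens.
        t = F.adaptive_avg_pool2d(x, self.token_hw)
        return t.flatten(2)                        # (B, C, S*S)

    def forward(self, skip, multiscale_feats):
        q = self.tokens(skip)                      # (B, Cq, L)
        kv = torch.cat([self.tokens(f) for f in multiscale_feats], dim=1)
        attn = torch.softmax(q @ kv.transpose(1, 2) / q.shape[-1] ** 0.5,
                             dim=-1)               # (B, Cq, sum Ci)
        fused = attn @ kv                          # (B, Cq, L)
        out = fused.view(*skip.shape[:2], self.token_hw, self.token_hw)
        # Upsample back and add as a residual to the original skip map.
        return skip + F.interpolate(out, size=skip.shape[2:],
                                    mode='bilinear', align_corners=False)
```

Replacing plain copy-and-concatenate skip connections with this kind of cross-scale attention is one way multiscale encoder features could be fused "collaboratively" before reaching the decoder, consistent with the behaviour the abstract attributes to the MCCT module.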