In China, waxberries are a specialty fruit whose harvest requires substantial manual labor each season. To reduce this labor demand, automated fruit-picking equipment has been widely developed over the last decade. However, much of the image-analysis research on waxberry segmentation suffers from incomplete datasets, insufficient segmentation accuracy, and susceptibility to interference. Here, we developed the Adaptive Zone-fusion Network (AZNet), designed to precisely segment waxberries in an orchard environment and thereby enable automated fruit harvesting. We first collected a diverse set of images to train and evaluate the model; AZNet demonstrated exceptional performance, achieving a mean Accuracy (mAcc) of 99.83%, a mean Intersection over Union (mIoU) of 96.57%, and a mean Dice coefficient (mDice) of 98.20%. To further validate AZNet's effectiveness, we compared it against several state-of-the-art (SOTA) models, including DeepLab V3, SegNet, PSPNet, U-Net, RefineNet, FCN, DANet, and FastSCNN. We also performed an ablation study to substantiate the individual impact of each module within AZNet. These experiments show that AZNet achieves accurate semantic segmentation of waxberries in complex orchard environments and is suitable for automated fruit-harvesting devices. AZNet improves upon previous models by constructing information-fusion modules from spatial attention, channel attention, Atrous Spatial Pyramid Pooling (ASPP), and graph convolution, which can be leveraged in the future to enhance other fruit-picking software, reducing fruit loss and the investment of human resources.
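The reported metrics follow their standard definitions. As a minimal illustrative sketch (NumPy, not the authors' evaluation code), IoU and Dice for a single binary mask pair can be computed as:

```python
import numpy as np

def iou_and_dice(pred, target):
    """Compute Intersection over Union and Dice coefficient for binary masks.

    pred, target: arrays of the same shape, interpreted as boolean masks.
    mIoU/mDice in the paper would average these per-class scores.
    """
    pred = pred.astype(bool)
    target = target.astype(bool)
    inter = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    total = pred.sum() + target.sum()
    # Empty-mask convention: a perfect match of two empty masks scores 1.0
    iou = inter / union if union else 1.0
    dice = 2 * inter / total if total else 1.0
    return float(iou), float(dice)
```

For example, a prediction covering two pixels against a ground truth covering one overlapping pixel yields IoU = 1/2 and Dice = 2/3.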