Automatic detection and segmentation of stomata in microscopic images is crucial for precise quantification of the stomatal phenotype in rape. However, this task poses several challenges: (1) Infrequent: The stomata of rape are domain-specific objects, which makes feature extractors pre-trained on generic datasets and reused through transfer learning unreliable; (2) Irregular: The detection and segmentation of stomata are complicated by variation in their shape, size, and tilt angle; (3) Imbalanced: The samples in the detection and segmentation tasks are imbalanced between low-quality and high-quality bounding boxes and between foreground and background pixels, respectively. In this research, a novel multi-task model named I3-YOLOv8s is proposed for detecting and segmenting the stomata of rape during its bolting stage. Specifically, for the Infrequent problem, a self-supervised learning method based on masked image reconstruction is designed to pre-train a domain-specific backbone network; for the Irregular problem, a CA block based on the coordinate attention mechanism is developed in the multi-scale neck network; finally, for the Imbalanced problem, a novel loss function based on the focal EIoU loss and the focal loss is proposed in the decoupled head. Experimental results indicate that the proposed I3-YOLOv8s achieves an F1 score of 93.29 % with a single-image inference latency of 14.1 ms for detection, and an F1 score of 92.51 % with a single-image inference latency of 14.8 ms for segmentation. I3-YOLOv8s achieves state-of-the-art performance and an optimal trade-off between accuracy and speed. Experimental analyses further substantiate the efficacy of each module and attest to the dependability of deploying I3-YOLOv8s on edge computing devices for agricultural production.
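To make the Imbalanced problem more concrete, the following is a minimal sketch of the two generic building blocks named above, written in their commonly published forms: a focal EIoU loss for one predicted/ground-truth box pair and a binary focal loss for one pixel. It is not the I3-YOLOv8s loss itself; the abstract does not state how the two terms are combined or weighted. The function names, the `(x1, y1, x2, y2)` box layout, and the default values `gamma=0.5`, `alpha=0.25`, and `gamma=2.0` are assumptions chosen for illustration.

```python
import math


def focal_eiou_loss(box, gt, gamma=0.5, eps=1e-7):
    """Focal EIoU loss for one box pair; boxes are (x1, y1, x2, y2) (assumed layout)."""
    # Widths and heights of the predicted and ground-truth boxes.
    w, h = box[2] - box[0], box[3] - box[1]
    wg, hg = gt[2] - gt[0], gt[3] - gt[1]

    # Intersection over union.
    iw = max(0.0, min(box[2], gt[2]) - max(box[0], gt[0]))
    ih = max(0.0, min(box[3], gt[3]) - max(box[1], gt[1]))
    inter = iw * ih
    union = w * h + wg * hg - inter
    iou = inter / union if union > 0 else 0.0

    # Smallest box enclosing both boxes.
    cw = max(box[2], gt[2]) - min(box[0], gt[0])
    ch = max(box[3], gt[3]) - min(box[1], gt[1])

    # Squared distance between the two box centres.
    dx = (box[0] + box[2]) / 2.0 - (gt[0] + gt[2]) / 2.0
    dy = (box[1] + box[3]) / 2.0 - (gt[1] + gt[3]) / 2.0
    rho2 = dx * dx + dy * dy

    # EIoU = IoU term + centre-distance term + width term + height term.
    eiou = (1.0 - iou
            + rho2 / (cw * cw + ch * ch + eps)
            + (w - wg) ** 2 / (cw * cw + eps)
            + (h - hg) ** 2 / (ch * ch + eps))

    # Focal weighting: low-IoU (low-quality) boxes contribute less.
    return (iou ** gamma) * eiou


def binary_focal_loss(p, y, alpha=0.25, gamma=2.0, eps=1e-7):
    """Focal loss for one pixel; p is the predicted foreground probability, y is 0 or 1."""
    p = min(max(p, eps), 1.0 - eps)           # clamp to avoid log(0)
    p_t = p if y == 1 else 1.0 - p            # probability assigned to the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha
    # Easy pixels (p_t near 1) are down-weighted by (1 - p_t) ** gamma.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)
```

In this sketch, the `iou ** gamma` factor reduces the contribution of poorly overlapping boxes to the regression loss, while the `(1 - p_t) ** gamma` factor reduces the contribution of easily classified pixels, which are typically the abundant background pixels.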