ABSTRACT
In recent years, the application of robots in the field of fruit picking has steadily increased. Mechanized methods for harvesting oil tea fruits include comb picking, vibratory picking, and gripping picking, among others. Traditional reliance on a single picking method is limited by variability in fruit size, shading, and environmental conditions. To develop a universal vision system suitable for picking robots that employ multiple picking methods, and to achieve intelligent harvesting of oil tea fruits, this paper proposes an enhanced You Only Look Once v7 (YOLOv7)‐based oil tea fruit recognition method designed for subsequent clamp or comb picking. The network's feature extraction capability is enhanced by incorporating an attention mechanism, an optimized small target detection layer, and an improved training loss function, thereby improving its detection of occluded and small target fruits. An innovative Automatic Assignment (AA) method clusters and subclusters the detected oil tea fruits, providing crucial fruit distribution data to optimize the robot's picking strategy. Additionally, for vibration harvesting, this paper introduces a vibration point detection method that combines the Pyramid Scene Parsing Network (PSPNet) semantic segmentation network with connectivity domain analysis to identify vibration points on the trunks and branches of oil tea trees. Experimental results demonstrate that the generalized visual detection method proposed in this study surpasses existing models in identifying oil tea fruits, with the enhanced YOLOv7 model achieving mean average precision, recall, and accuracy of 91.7%, 94.0%, and 94.9%, respectively. The AA method achieves clustering and subclustering of oil tea fruits with a processing delay of under 5 ms. For vibration harvesting, PSPNet achieves branch segmentation precision, recall, and intersection over union of 97.3%, 96.5%, and 94.5%, respectively.
The proposed branch vibration point detection method attains a detection accuracy of 93%, effectively pinpointing vibration points on the trunks and branches of oil tea trees. Overall, the proposed visual method can be implemented in robots using various picking techniques to enable automated harvesting.
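The abstract does not give the internal details of the AA clustering method. As a rough illustration of the idea of grouping detected fruits by spatial proximity to inform a picking strategy, the sketch below clusters bounding-box centers with a simple single-linkage, distance-threshold scheme (union-find). The function name, the pixel threshold, and the input boxes are hypothetical, not the paper's actual algorithm.

```python
def cluster_fruit_centers(boxes, max_dist=60.0):
    """Group detected fruit bounding boxes whose centers lie within
    max_dist pixels of each other (single-linkage via union-find).
    boxes: list of (x1, y1, x2, y2); returns a list of index clusters."""
    centers = [((x1 + x2) / 2.0, (y1 + y2) / 2.0) for x1, y1, x2, y2 in boxes]
    parent = list(range(len(boxes)))

    def find(i):
        # Path-halving find for the union-find structure.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Union every pair of detections whose centers are close enough.
    for i in range(len(centers)):
        for j in range(i + 1, len(centers)):
            dx = centers[i][0] - centers[j][0]
            dy = centers[i][1] - centers[j][1]
            if (dx * dx + dy * dy) ** 0.5 <= max_dist:
                parent[find(i)] = find(j)

    # Collect members of each root into a cluster.
    groups = {}
    for i in range(len(boxes)):
        groups.setdefault(find(i), []).append(i)
    return list(groups.values())
```

With fruit clusters in hand, a planner could, for example, direct comb picking at dense clusters and clamp picking at isolated fruits; the sub-5 ms latency reported for the AA method suggests a similarly lightweight geometric grouping.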
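To illustrate what connectivity domain analysis on a PSPNet branch mask might look like, the sketch below labels connected foreground regions of a binary mask with a breadth-first search and returns the centroid of the largest region as a candidate vibration point. This is a minimal stand-in under assumed inputs (a small Python list-of-lists mask), not the paper's actual detection pipeline.

```python
from collections import deque

def largest_component_centroid(mask):
    """Find 4-connected components of foreground pixels in a binary mask
    and return the centroid (row, col) of the largest one -- a stand-in
    for choosing a vibration point on a segmented trunk or branch."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    best = []
    for r in range(h):
        for c in range(w):
            if mask[r][c] and not seen[r][c]:
                # BFS flood fill to collect one connected component.
                comp, queue = [], deque([(r, c)])
                seen[r][c] = True
                while queue:
                    y, x = queue.popleft()
                    comp.append((y, x))
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
                if len(comp) > len(best):
                    best = comp
    # Centroid of the largest component.
    row = sum(p[0] for p in best) / len(best)
    col = sum(p[1] for p in best) / len(best)
    return row, col
```

In practice the segmentation output would be a full-resolution class map, and the chosen point would also account for branch thickness and reachability, but the component-labeling step is the core of a connectivity-based analysis.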