Abstract

Precise localization of fruit and picking points is crucial for harvesting table grapes with automated picking robots in unstructured agricultural environments. Most studies locate picking points through multi-step pipelines built on top of fruit detection, which leads to slow detection, cumbersome models, and algorithmic fragmentation. This study proposes YOLOv8-GP (YOLOv8-Grape and Picking point), an improved model based on YOLOv8n-Pose that detects grape clusters and picking points simultaneously. YOLOv8-GP is an end-to-end network that integrates object detection and keypoint detection. Specifically, the Bottleneck in C2f is replaced with a FasterNet Block incorporating EMA (Efficient Multi-Scale Attention), yielding C2f-Faster-EMA, and BiFPN replaces the original PAN as the neck network. The FasterNet Block, built on partial convolution (PConv), reduces redundant computation and memory access and thereby extracts spatial features more efficiently. The EMA attention mechanism delivers performance gains at low computational overhead, and BiFPN strengthens feature fusion. Experimental results show that YOLOv8-GP achieves an AP of 89.7% for grape cluster detection and a Euclidean distance error of less than 30 pixels for picking point detection. In addition, the number of parameters is reduced by 47.73% and the model complexity is 6.1 GFLOPs. In summary, YOLOv8-GP offers excellent detection performance, while its reduced parameter count and complexity lower deployment costs and ease implementation on mobile robots.
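To make the partial-convolution idea concrete, the following is a minimal PyTorch sketch of a FasterNet-style block: a PConv layer convolves only a fraction of the channels and passes the rest through untouched, followed by two pointwise convolutions. The 1/4 channel ratio, layer shapes, and the omission of the EMA attention module are illustrative assumptions, not the authors' exact C2f-Faster-EMA configuration.

import torch
import torch.nn as nn

class PConv(nn.Module):
    """Partial convolution: apply a 3x3 conv to only a fraction of the
    channels and pass the remaining channels through untouched, which
    cuts FLOPs and memory access compared with a full convolution."""
    def __init__(self, channels: int, ratio: float = 0.25):
        super().__init__()
        self.conv_ch = int(channels * ratio)  # channels that are convolved
        self.conv = nn.Conv2d(self.conv_ch, self.conv_ch, 3, padding=1, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x1, x2 = torch.split(x, [self.conv_ch, x.size(1) - self.conv_ch], dim=1)
        return torch.cat((self.conv(x1), x2), dim=1)  # untouched channels rejoin the output

class FasterNetBlock(nn.Module):
    """FasterNet-style block: PConv for spatial mixing, then two
    pointwise (1x1) convolutions, with a residual connection."""
    def __init__(self, channels: int, expansion: int = 2):
        super().__init__()
        hidden = channels * expansion
        self.pconv = PConv(channels)
        self.mlp = nn.Sequential(
            nn.Conv2d(channels, hidden, 1, bias=False),
            nn.BatchNorm2d(hidden),
            nn.ReLU(inplace=True),
            nn.Conv2d(hidden, channels, 1, bias=False),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + self.mlp(self.pconv(x))

# Usage: an 80x80 feature map with 64 channels keeps its shape
y = FasterNetBlock(64)(torch.randn(1, 64, 80, 80))
print(y.shape)  # torch.Size([1, 64, 80, 80])

In the paper's C2f-Faster-EMA, a block of this kind replaces the Bottleneck inside C2f and is combined with EMA attention; this sketch only illustrates why PConv reduces redundant computation.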
