Peduncle collision-free grasping based on deep reinforcement learning for tomato harvesting robot

Yajun Li, Qingchun Feng, Yifan Zhang, Chuanlang Peng, Yuhang Ma, Cheng Liu, Mengfei Ru, Jiahui Sun, Chunjiang Zhao

doi:10.1016/j.compag.2023.108488

Abstract

Collision-free grasping of the thin, short peduncles connecting cherry tomato clusters to the main stem is crucial for tomato harvesting robots. Recognizing that the optimal operating posture varies from one peduncle to another, this study proposed a novel peduncle grasping posture decision model using deep reinforcement learning (DRL) for tomato harvesting manipulators, to overcome the collision issue caused by fixed-posture grasping. The model dynamically generates action sequences for the harvesting manipulator, ensuring that the end-effector approaches the peduncle along a collision-free path with the optimal grasping posture. Building upon prior research into the multi-task identification of tomato clusters, peduncles, and the main stem, a keypoint-based spatial pose description model for tomato bunches was devised. From this model, the optimal operating posture for the end-effector on the peduncle was established. An improved HER-SAC (Soft Actor-Critic with Hindsight Experience Replay) algorithm was then developed to guide the end-effector in collision-free grasping motions. The reward function of this algorithm incorporated end-effector posture constraints obtained from the optimal posture plane. In the training phase, a heuristic strategy model providing prior knowledge was combined with a dynamic gain module to avoid locally optimal policies, which together enhanced the learning efficiency. In simulation, the method improved the success rate of peduncle grasping by at least 14 % compared with SAC, HER-DDPG and HER-TD3. In identical scenarios, the improved HER-SAC reached the desired posture in at least 15.5 % fewer steps than the other algorithms. In field experiments conducted in tomato greenhouses, the robot achieved a harvesting success rate of 85.5 %, which was an increase of 57.3 % and 43.0 % compared to traditional methods with fixed horizontal and parallel-to-main-stem postures, respectively. The average operation time, from identification to successful harvesting, was 11.42 s. These findings offer a promising solution for enhancing the efficiency of tomato-harvesting robots.
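The abstract does not give the form of the reward or the replay scheme, so the following is only a minimal sketch of the general idea: a goal-conditioned reward that combines a position tolerance with a posture-plane constraint, plus the standard "final" goal relabeling used in Hindsight Experience Replay. It is not the authors' improved HER-SAC. The function names, the dictionary keys, the sparse 0/-1 reward, and the tolerances `pos_tol` and `angle_tol_deg` are all assumptions made for illustration.

```python
import numpy as np


def posture_reward(achieved_pos, achieved_axis, goal_pos, plane_normal,
                   pos_tol=0.01, angle_tol_deg=10.0):
    """Hypothetical sparse reward: 0.0 if both constraints hold, else -1.0.

    Position constraint: end-effector within pos_tol (metres) of the goal.
    Posture constraint: approach axis within angle_tol_deg of the posture
    plane, where the plane is given by its normal vector.
    """
    dist = np.linalg.norm(np.asarray(achieved_pos) - np.asarray(goal_pos))
    axis = np.asarray(achieved_axis, dtype=float)
    normal = np.asarray(plane_normal, dtype=float)
    axis = axis / np.linalg.norm(axis)
    normal = normal / np.linalg.norm(normal)
    # Angle between the approach axis and the plane (0 deg = lies in the plane).
    sin_angle = np.clip(abs(float(axis @ normal)), 0.0, 1.0)
    out_of_plane_deg = np.degrees(np.arcsin(sin_angle))
    ok = dist <= pos_tol and out_of_plane_deg <= angle_tol_deg
    return 0.0 if ok else -1.0


def her_relabel_final(episode, plane_normal):
    """Relabel every transition with the episode's final achieved position.

    This is the plain HER 'final' strategy: the position actually reached at
    the end of the episode is treated as if it had been the goal, and the
    reward is recomputed against it. The posture plane is left unchanged
    (an assumption of this sketch).
    """
    final_pos = episode[-1]["achieved_pos"]
    relabeled = []
    for step in episode:
        r = posture_reward(step["achieved_pos"], step["achieved_axis"],
                           final_pos, plane_normal)
        relabeled.append({**step, "goal_pos": final_pos, "reward": r})
    return relabeled
```

Under these assumptions, an episode that missed its original goal still yields a useful transition after relabeling, but only when the final approach axis also lies close to the posture plane. This is one way a posture constraint could enter the learning signal.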

Full Text