Abstract

When deep reinforcement learning algorithms are used for path planning of a multi-DOF fruit-picking manipulator in unstructured environments, the manipulator struggles to obtain high-value samples at the beginning of training, resulting in low learning and convergence efficiency. To reduce inefficient exploration in unstructured environments, this paper first proposes a reinforcement learning strategy guided by expert experience. Simulation experiments were used to study the ratio of expert experience to newly generated samples and the frequency of return visits to expert experience. A ratio of expert experience that declined from 0.45 to 0.35 improved the learning efficiency of the model more effectively than a constant ratio: the success rate increased by 1.26% compared with a constant ratio of 0.35, and by 20.37% compared with a constant ratio of 0.45. The highest success rate was achieved when the frequency of return visits was 15 per 50 episodes, an improvement of 31.77%. The results showed that the proposed method effectively improves model performance and enhances learning efficiency at the beginning of training in unstructured environments. This training method has implications for the training process of reinforcement learning in other domains.

Highlights

  • Automatic fruit-picking systems based on a multi-DOF manipulator have become a major direction in fruit harvesting in order to increase efficiency and reduce production costs [1]

  • Chun proposed a deep reinforcement learning algorithm framework that combined the advantages of convolutional neural networks (CNN) and the deep deterministic policy gradient (DDPG) algorithm to exploit delivery task information and automated guided vehicle (AGV) travel times in the dynamic scheduling of AGVs [17]

  • In the early stages of fruit picking, the complexity and disorder of the target locations, together with the random initialization of the network parameters, make the model inefficient and the network difficult to converge during training


Summary

Introduction

Automatic fruit-picking systems based on a multi-DOF (degree of freedom) manipulator have become a major direction in fruit harvesting, aimed at increasing efficiency and reducing production costs [1]. Zheng et al. [25] proposed a deep deterministic policy gradient algorithm based on a stepwise migration strategy, which introduced spatial constraints for stepwise training in an obstacle-free environment to speed up network convergence; the prior knowledge thus obtained was then used to guide the path planning of a multi-DOF manipulator in a complex unstructured environment. A deep reinforcement learning strategy combined with expert experience was proposed [28] to improve the learning efficiency of the algorithm at the beginning of training and to reduce the blind exploration of the multi-DOF manipulator. In the early stages of fruit picking, the complexity and disorder of the target locations and the random initialization of the network parameters make the model inefficient and the network difficult to converge during training.

Figure 3. Picking scene. On the left side is an apple tree model, in which red spheres indicate ripe apples and green spheres indicate unripe apples; on the right side is a multi-DOF picking manipulator fixed on top of a mobile platform.
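The expert-experience guidance described above can be sketched as a replay buffer that draws each training batch partly from expert demonstrations and partly from the agent's own transitions, with the expert fraction annealed from 0.45 down to 0.35 as reported in the abstract. This is a minimal illustrative sketch, not the paper's implementation; the class name, the linear annealing schedule, and the `anneal_steps` parameter are assumptions.

```python
import random


class MixedReplayBuffer:
    """Illustrative buffer mixing expert demonstrations with agent
    transitions; the expert sampling ratio decays linearly from
    start_ratio to end_ratio (0.45 -> 0.35 per the paper's finding)."""

    def __init__(self, expert_transitions,
                 start_ratio=0.45, end_ratio=0.35, anneal_steps=10_000):
        self.expert = list(expert_transitions)  # fixed expert pool
        self.agent = []                         # grows during training
        self.start_ratio = start_ratio
        self.end_ratio = end_ratio
        self.anneal_steps = anneal_steps        # assumed schedule length
        self.step = 0

    def add(self, transition):
        """Store a transition generated by the learning agent."""
        self.agent.append(transition)

    def expert_ratio(self):
        """Linearly anneal the expert fraction over anneal_steps samples."""
        frac = min(self.step / self.anneal_steps, 1.0)
        return self.start_ratio + frac * (self.end_ratio - self.start_ratio)

    def sample(self, batch_size):
        """Draw a batch: expert_ratio() of it from expert data,
        the remainder from the agent's own experience."""
        self.step += 1
        n_expert = int(round(self.expert_ratio() * batch_size))
        batch = random.sample(self.expert, min(n_expert, len(self.expert)))
        if self.agent:
            batch += random.choices(self.agent, k=batch_size - len(batch))
        return batch
```

A "return visit" to expert experience (15 times per 50 episodes in the best configuration) could then be realized by periodically sampling a batch drawn entirely from `self.expert`; that scheduling detail is left out of the sketch.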

The Multi-DOF Manipulator
Model of Untargeted Fruit
The Impact of Different Amounts of Expert Experience
Conclusions
