Abstract

Grasping control for intelligent robots must contend with model uncertainties and nonlinearities. In this paper, we propose the Kernel-based Least-Squares Soft Bellman residual Actor–Critic (KLSAC) algorithm for robotic grasping. In the proposed approach, a novel linear temporal-difference learning algorithm using the least-squares soft Bellman residual (LS2BR) method is designed for policy evaluation. In addition, KLSAC adopts a sparse-kernel feature representation method based on approximate linear dependency (ALD) analysis to construct features for the continuous state–action space. Compared with typical deep reinforcement learning algorithms, KLSAC has two main advantages: first, the critic module can converge rapidly by computing the fixed point of the linear soft Bellman equation via least-squares optimization; second, the kernel-based feature construction approach only requires predefining a base kernel function and improves the generalization ability of KLSAC. Simulation studies on robotic grasping control were conducted in the V-REP simulator. The results demonstrate that, compared with other typical RL algorithms (e.g., SAC and BMPO), the proposed KLSAC algorithm achieves better performance in terms of sample efficiency and asymptotic convergence. Furthermore, experimental results on a real UR5 robot validated that KLSAC performs well in the real world.
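
To make the two ingredients named in the abstract concrete, the sketch below illustrates (i) an ALD-based kernel dictionary for sparse state–action features and (ii) a closed-form least-squares solve for the weights of a linear soft Bellman equation. This is a minimal illustration of the general techniques, not the authors' implementation; the Gaussian kernel and all hyperparameters (sigma, nu, gamma, alpha, reg) are placeholder assumptions, and the entropy-regularized target may differ from the paper's exact LS2BR formulation.

```python
# Illustrative sketch only (not the paper's code): ALD-based sparse kernel
# features plus a least-squares fixed point of a linear soft Bellman equation.
import numpy as np

def gaussian_kernel(x, y, sigma=1.0):
    """Base kernel over state-action vectors; the Gaussian choice is an assumption."""
    return np.exp(-np.sum((x - y) ** 2) / (2.0 * sigma ** 2))

def build_dictionary(samples, nu=0.1, sigma=1.0):
    """Approximate linear dependency (ALD) test: keep a sample only if its
    kernel feature cannot be approximated (within threshold nu) by a linear
    combination of the features of the current dictionary."""
    dictionary = [samples[0]]
    for x in samples[1:]:
        K = np.array([[gaussian_kernel(a, b, sigma) for b in dictionary]
                      for a in dictionary])
        k_x = np.array([gaussian_kernel(d, x, sigma) for d in dictionary])
        c = np.linalg.solve(K + 1e-8 * np.eye(len(dictionary)), k_x)
        delta = gaussian_kernel(x, x, sigma) - k_x @ c  # ALD residual
        if delta > nu:
            dictionary.append(x)
    return dictionary

def features(x, dictionary, sigma=1.0):
    """Sparse kernel feature vector phi(s, a) over the ALD dictionary."""
    return np.array([gaussian_kernel(d, x, sigma) for d in dictionary])

def ls_soft_bellman_critic(phi, phi_next, rewards, logp_next,
                           gamma=0.99, alpha=0.2, reg=1e-6):
    """Closed-form critic weights w for a linear soft Bellman equation:
        phi w = r + gamma * (phi' w - alpha * log pi(a'|s')),
    rearranged to (phi^T (phi - gamma phi')) w = phi^T (r - gamma alpha log pi)."""
    A = phi.T @ (phi - gamma * phi_next) + reg * np.eye(phi.shape[1])
    b = phi.T @ (rewards - gamma * alpha * logp_next)
    return np.linalg.solve(A, b)
```

Because the critic weights come from a single linear solve rather than gradient descent, each policy-evaluation step is a one-shot computation over the batch, which is the source of the rapid-convergence claim in the abstract.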
