Abstract

Pushing and grasping are among the most basic and essential skills of a robot: in cluttered scenes, pushing makes room for the arm and fingers to grasp objects. Building on the theoretical foundations and principal methods of deep reinforcement learning, we propose a modified Actor-Critic (A-C) framework, Cross-entropy Softmax A-C (CSAC), combined with Prioritized Experience Replay (PER), which unites the advantages of value-function-based and policy-gradient-based algorithms. The grasping model is trained with self-supervised learning to achieve an end-to-end mapping from images to pushing and grasping actions, and the overall framework is divided into a vision module and an action module. To further improve sample diversity and the robot's exploration performance during grasping training, the prioritized experience replay of the CSAC-PER algorithm is refined: the experience replay buffer is sampled dynamically according to a prior beta distribution, yielding a beta-distribution-based dynamic sampling algorithm (CSAC-β) built on CSAC. Although its efficiency is lower in the early stages of training, simulation experiments show that the CSAC-β algorithm ultimately achieves good results, with a higher grasping success rate (90%).
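To illustrate the beta-distribution-based dynamic sampling described above, the sketch below draws replay positions from a Beta prior and maps them onto priority-ranked transitions. This is a minimal sketch under stated assumptions, not the paper's implementation: the class name `BetaSampledReplayBuffer`, the buffer layout, and the shape parameters `a` and `b` are all hypothetical.

```python
import numpy as np

class BetaSampledReplayBuffer:
    """Hypothetical sketch of beta-prior dynamic sampling for a replay buffer.

    Assumption: transitions are ranked by a TD-error-based priority, and a
    Beta(a, b) sample in [0, 1) is mapped to a rank, so the shape of the Beta
    prior controls how strongly sampling favors high-priority experience.
    """

    def __init__(self, capacity, a=2.0, b=5.0, seed=0):
        self.capacity = capacity
        self.a, self.b = a, b          # Beta shape parameters (assumed values)
        self.rng = np.random.default_rng(seed)
        self.transitions = []          # (state, action, reward, next_state, done)
        self.priorities = []           # one priority value per stored transition

    def add(self, transition, priority):
        # Evict the oldest transition once the buffer is full.
        if len(self.transitions) >= self.capacity:
            self.transitions.pop(0)
            self.priorities.pop(0)
        self.transitions.append(transition)
        self.priorities.append(priority)

    def sample(self, batch_size):
        # Rank transitions from highest to lowest priority.
        order = np.argsort(self.priorities)[::-1]
        n = len(order)
        # Draw positions in [0, 1) from the Beta prior; mass near 0 maps to
        # high-priority ranks, biasing the batch toward informative samples.
        u = self.rng.beta(self.a, self.b, size=batch_size)
        idx = order[np.minimum((u * n).astype(int), n - 1)]
        return [self.transitions[i] for i in idx]
```

With this design, adjusting `a` and `b` over the course of training would shift the sampling mass between high-priority and uniform-like replay, which is one plausible way to trade off exploration against exploitation as the abstract suggests.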
