Abstract
Traditional dual-manipulator control systems suffer not only from complex motion-coupling problems but also from a large computational burden, which makes it difficult to meet the requirements of intelligent assembly. In this paper, based on multi-agent reinforcement learning theory, the Multi-Agent Deep Deterministic Policy Gradient (MADDPG) algorithm is investigated for collaborative shaft-slot assembly with a dual-manipulator system. Because collaborative shaft-slot assembly with two manipulators is a long-sequence decision-making problem, traditional multi-agent reinforcement learning often suffers from sparse rewards. To address this problem, this paper considers the influence of each manipulator's individual decisions on the overall task reward when designing the overall reward of multi-agent reinforcement learning. In the proposed algorithm, the difference between each manipulator's states before and after a step is computed and applied as an internal state excitation to the overall task reward, thereby improving the traditional reward function of multi-agent reinforcement learning. To verify the designed algorithm, a dual-manipulator shaft-slot assembly system and test scenario are built on the CoppeliaSim simulation platform. Simulation results show that the success rate of shaft-slot assembly with the improved MADDPG algorithm is about 83%.
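As a rough illustration of the reward shaping described above (a sketch, not the authors' exact formulation), the snippet below adds a per-manipulator intrinsic term, computed from the change in each manipulator's state between consecutive steps, to the shared task reward used by MADDPG. The function name, the state representation, and the weight `beta` are all assumptions made for illustration.

```python
import numpy as np

def shaped_team_reward(task_reward, prev_states, curr_states, beta=0.1):
    """Augment the shared task reward with a state-difference excitation.

    task_reward  : scalar extrinsic reward for the whole assembly task
    prev_states  : list of per-manipulator state vectors at the previous step
    curr_states  : list of per-manipulator state vectors at the current step
    beta         : weight of the intrinsic term (assumed value)
    """
    intrinsic = 0.0
    for s_prev, s_curr in zip(prev_states, curr_states):
        # State change of this manipulator between consecutive steps,
        # used as an internal excitation to densify the sparse task reward.
        intrinsic += np.linalg.norm(np.asarray(s_curr) - np.asarray(s_prev))
    return task_reward + beta * intrinsic
```

Under this kind of shaping, each manipulator's own state change contributes to the team reward, so a single agent's decisions are reflected in the learning signal even when the extrinsic assembly reward is sparse.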