Abstract

In this paper, we apply reinforcement learning (RL) to control a 3-DoF manipulator. Learning is achieved using on-policy temporal-difference (TD) learning. To control the manipulator, TD agents govern the robot's movements by adjusting each joint angle. We first implement the SARSA algorithm by repeatedly updating a hash table with the current and next state-action values and the received rewards. However, since conventional RL algorithms suffer from a heavy computational load, we propose RL algorithms that apply a multi-agent approach with an assumption and vector-valued states. We report the results of the classical RL algorithm and compare them with our approach. For the simulation, we test a three-link planar arm in two cases: positioning the end-effector at a fixed goal point and at random goal points. The simulation results show that the robot completes the task successfully with an improvement in computational load.
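
The abstract describes tabular SARSA driven by hash-table updates of state-action values. The sketch below illustrates that update rule under assumed details not given in the abstract: the environment interface (reset/step), the discrete action set, and the values of the learning rate, discount factor, and exploration rate are all placeholders for illustration.

```python
import random
from collections import defaultdict

def sarsa(env, actions, episodes=500, alpha=0.1, gamma=0.99, epsilon=0.1):
    """Minimal on-policy TD (SARSA) sketch with a hash-table Q-function.

    Assumes env.reset() returns a hashable state and env.step(action)
    returns (next_state, reward, done); these are illustrative, not the
    paper's actual simulation interface.
    """
    Q = defaultdict(float)  # hash table keyed by (state, action)

    def choose(state):
        # epsilon-greedy selection over the tabulated state-action values
        if random.random() < epsilon:
            return random.choice(actions)
        return max(actions, key=lambda a: Q[(state, a)])

    for _ in range(episodes):
        state = env.reset()
        action = choose(state)
        done = False
        while not done:
            next_state, reward, done = env.step(action)
            next_action = choose(next_state)
            # On-policy TD update using the current and next state-action pair
            td_target = reward + (0.0 if done else gamma * Q[(next_state, next_action)])
            Q[(state, action)] += alpha * (td_target - Q[(state, action)])
            state, action = next_state, next_action
    return Q
```

Because the table grows with the number of visited state-action pairs, each additional joint enlarges the state space multiplicatively, which is the computational-load issue the paper's multi-agent formulation is meant to mitigate.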
