Abstract

Reinforcement learning has attracted growing interest from researchers in recent years, particularly because of its strong performance in robotics. As smart systems become widespread in rapidly changing technology, communication between these systems and their environment remains an actively studied problem. A reinforcement learning agent decides which action to take next based on the rewards it receives for its experience in the environment. Its most important difference from other machine learning approaches is that it does not need any pre-collected data during the training phase. This study presents a deep reinforcement learning method that regulates the movements of three robots operating in a confined area. The robots were trained with the Policy Gradient algorithm and their performance was tested. With the presented method, the robots learn both how to move within the area and how to avoid colliding with one another. Robots on this kind of mission must not only complete their task but also be trained to operate safely. To address this, fixed obstacles were placed in the training environment, so that each robot performs its task without hitting the fixed or mobile obstacles in the closed environment. Training was carried out in a simulation environment. The problem is addressed with a concurrent learning approach, without any communication between agents. A multi-agent approach based on policy gradient, temporal difference error, and actor-critic methods is used. The performance of the robots is reported for reinforcement learning supported by deep learning algorithms. The simulation results show that robots trained with the Policy Gradient algorithm succeed in cleaning the whole area. The rewards obtained by the robots at the end of training are given in detail in Section 4.
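To make the combination of policy gradient, temporal difference error, and actor-critic more concrete, the following is a minimal tabular sketch, not the study's implementation. The study uses deep networks and a cleaning environment that the abstract does not specify; here the function names (`softmax`, `actor_critic_step`, `train_independent_agents`), the learning rates, and the single-state toy reward (+1 for a "safe" action, -1 for a "collision" action) are all illustrative assumptions. Each of the three agents keeps its own parameters and learns concurrently without exchanging any information.

```python
import math
import random


def softmax(prefs):
    """Numerically stable softmax over a list of action preferences."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def actor_critic_step(theta, values, state, action, reward, next_state, done,
                      alpha_actor=0.1, alpha_critic=0.1, gamma=0.99):
    """One tabular actor-critic update driven by the TD error.

    theta[state] holds the actor's action preferences and values[state]
    holds the critic's state-value estimate. Both are updated in place.
    Returns the TD error used for the two updates.
    """
    target = reward + (0.0 if done else gamma * values[next_state])
    td_error = target - values[state]
    # Action probabilities under the current (pre-update) policy.
    probs = softmax(theta[state])
    # Critic: move V(s) toward the bootstrapped target.
    values[state] += alpha_critic * td_error
    # Actor: d/d theta[a'] of log pi(a|s) is 1[a' == a] - pi(a'|s).
    for a in range(len(probs)):
        grad_log = (1.0 if a == action else 0.0) - probs[a]
        theta[state][a] += alpha_actor * td_error * grad_log
    return td_error


def train_independent_agents(n_agents=3, steps=2000, seed=0):
    """Train agents concurrently on a hypothetical one-state toy task.

    Action 1 stands in for a safe move (reward +1) and action 0 for a
    collision (reward -1). This reward is an assumption for illustration.
    Returns each agent's final probability of choosing action 1.
    """
    rng = random.Random(seed)
    agents = [{"theta": [[0.0, 0.0]], "values": [0.0]} for _ in range(n_agents)]
    for _ in range(steps):
        for agent in agents:  # no parameters or messages are shared
            probs = softmax(agent["theta"][0])
            action = 1 if rng.random() < probs[1] else 0
            reward = 1.0 if action == 1 else -1.0
            actor_critic_step(agent["theta"], agent["values"],
                              0, action, reward, 0, True)
    return [softmax(agent["theta"][0])[1] for agent in agents]


if __name__ == "__main__":
    print(train_independent_agents())
```

In this toy setting every agent should end up strongly preferring the safe action. A deep variant would replace the tables `theta` and `values` with neural networks while keeping the same TD-error-weighted update.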

