Abstract
This paper presents a modified R-learning algorithm derived from the traditional average reward reinforcement learning algorithm. Reinforcement learning problems constitute an important class of learning and control problems faced by artificial intelligence systems. The general framework of reinforcement learning can be divided into two forms: discounted reward reinforcement learning and average reward reinforcement learning. R-learning is a model-free average reward reinforcement learning algorithm. Compared with the conventional R-learning algorithm, the version examined in detail here adds a directing reward function with a punitive mechanism and an exploration strategy based on the roulette technique. As a result of this design, the agent can gain more information in every learning step. The improved R-learning is applied to the RoboCup Simulation League (2D) and compared with Q-learning; empirical results show that learning efficiency increases.
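As a rough illustration of the two ingredients named above, the following minimal sketch pairs the textbook tabular R-learning update with a roulette-wheel (value-proportional) action choice. It is not the paper's implementation. The function names, the step sizes `beta` and `alpha`, the shift that makes the roulette weights positive, and the use of pre-update values in the average-reward update are all assumptions made for this example. The directing reward function with its punitive mechanism is not specified in the abstract, so it is not sketched here.

```python
import random


def roulette_select(values, rng, eps=1e-6):
    """Pick an index with probability proportional to its shifted value.

    Assumption: values are shifted by their minimum (plus eps) so that
    every weight is positive; the paper's exact scheme may differ.
    """
    low = min(values)
    weights = [v - low + eps for v in values]
    pick = rng.random() * sum(weights)
    cumulative = 0.0
    for index, weight in enumerate(weights):
        cumulative += weight
        if pick < cumulative:
            return index
    return len(weights) - 1  # guard against floating-point round-off


def r_learning_step(R, rho, s, a, r, s_next, beta=0.1, alpha=0.01):
    """Apply one tabular R-learning update in place and return the new rho.

    R is a list of per-state lists of action values; rho is the running
    estimate of the average reward per step.
    """
    best_here = max(R[s])
    best_next = max(R[s_next])
    was_greedy = R[s][a] == best_here
    # Relative action value: reward measured against the average reward.
    R[s][a] += beta * (r - rho + best_next - R[s][a])
    # The average reward estimate is adjusted only after greedy actions.
    if was_greedy:
        rho += alpha * (r - rho + best_next - best_here)
    return rho


if __name__ == "__main__":
    rng = random.Random(0)
    R = [[0.0, 0.0], [0.0, 0.0]]  # hypothetical 2-state, 2-action table
    rho = 0.0
    action = roulette_select(R[0], rng)
    rho = r_learning_step(R, rho, s=0, a=action, r=1.0, s_next=1)
    print(action, R, rho)
```

Unlike an epsilon-greedy rule, the roulette choice gives every action a selection probability that tracks its current value estimate, which is one plausible reading of how the agent could gather more information per learning step.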