Abstract

This paper presents a modified R-learning algorithm derived from the traditional average reward reinforcement learning algorithm. Reinforcement learning problems constitute an important class of learning and control problems faced by artificial intelligence systems. The general framework of reinforcement learning can be divided into two forms: discounted reward reinforcement learning and average reward reinforcement learning. R-learning is a model-free average reward reinforcement learning algorithm. Compared with the conventional R-learning algorithm, the version examined in detail here adds a directing reward function with a punitive mechanism and an exploration strategy based on the roulette technique. As a result of this design, the agent can gain more information in every learning step. The improved R-learning is applied to the RoboCup Simulation League (2D) and compared with Q-learning; the empirical results show that learning efficiency is increased.
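For orientation, the two ingredients named above can be sketched in a few lines of Python. This is a minimal illustration, not the paper's implementation: the update follows the commonly cited tabular form of R-learning, the roulette selection shifts values by their minimum so that weights stay positive (an assumption, since the abstract does not say how negative values are handled), and the names `roulette_select`, `r_learning_step`, `beta`, `alpha`, and `eps` are hypothetical. The directing reward function with its punitive mechanism is not specified in the abstract, so the reward is simply passed in by the caller.

```python
import random


def roulette_select(values, rng=random, eps=1e-6):
    """Pick an action index with probability proportional to its shifted value.

    Assumption: values are shifted by their minimum (plus eps) so every
    weight is positive; the paper may use a different normalisation.
    """
    low = min(values)
    weights = [v - low + eps for v in values]
    pick = rng.random() * sum(weights)
    cumulative = 0.0
    for index, weight in enumerate(weights):
        cumulative += weight
        if pick < cumulative:
            return index
    return len(weights) - 1  # guard against floating-point round-off


def r_learning_step(R, rho, s, a, reward, s_next, beta=0.1, alpha=0.01):
    """Apply one tabular R-learning update in place on R; return the new rho.

    R is a list of per-state lists of action values and rho is the running
    estimate of the average reward. beta and alpha are illustrative step sizes.
    """
    best_here = max(R[s])
    best_next = max(R[s_next])
    was_greedy = R[s][a] == best_here  # checked before R[s][a] changes
    # Relative value update: reward is measured against the average reward.
    R[s][a] += beta * (reward - rho + best_next - R[s][a])
    # The average reward estimate is adjusted only after greedy actions.
    if was_greedy:
        rho += alpha * (reward - rho + best_next - best_here)
    return rho
```

In a learning loop, the agent would call `roulette_select(R[s])` to choose an action, observe the reward and next state, and then call `r_learning_step` to update the table. Unlike Q-learning, no discount factor appears in the update; the estimate `rho` plays that stabilising role.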
