Abstract

Reinforcement learning (RL) is a kind of interactive learning method whose main characteristics are “trial and error” and “related reward.” A hierarchical reinforcement learning method based on action subrewards is proposed to address the “curse of dimensionality,” in which the state space grows exponentially with the number of features, and the resulting low convergence speed. The method greatly reduces the state space and chooses actions purposefully and efficiently, so as to optimize the reward function and speed up convergence. Applied to online learning in the game of Tetris, the experimental results show that the convergence speed of the algorithm is markedly improved by combining the hierarchical reinforcement learning algorithm with action subrewards, and that the “curse of dimensionality” problem is alleviated to a certain extent by the hierarchical method. Performance under different parameter settings is also compared and analyzed.

Highlights

  • Reinforcement learning (RL) is a kind of interactive learning method

  • The main ideas used to solve the “curse of dimensionality” are abstraction and approximation, which can be divided into four kinds of methods: state space clustering algorithms, finite state space search methods, value function approximation methods, and hierarchical reinforcement learning (HRL) [1]

  • In the HRL method, if the agent is only concerned with changes in the current local space or in the subtarget state, the policy update process is restricted to the local space or the high-level space, which speeds up the learning process and reduces the dependence of the algorithm’s convergence speed on environmental changes

Introduction

Reinforcement learning (RL) is a kind of interactive learning method whose main characteristics are “trial and error” and “related reward.” Throughout the learning process, the agent interacts with the environment to obtain rewards and improves its actions according to those rewards. The main ideas used to solve the “curse of dimensionality” are abstraction and approximation, which can be divided into four kinds of methods: state space clustering algorithms, finite state space search methods, value function approximation methods, and hierarchical reinforcement learning (HRL) [1]. The key problems addressed in this paper are the “curse of dimensionality” in the large discrete state spaces of infinitely repeated tasks, and the optimization of reward functions to improve the convergence speed of algorithms such as Q-learning. The experimental results show that both problems can be solved to a certain extent.
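The Q-learning algorithm named above learns a state–action value table from sampled rewards. A minimal tabular sketch of that standard update rule (the five-state chain environment, reward values, and hyperparameters below are illustrative assumptions, not taken from the paper):

```python
import random

random.seed(0)  # deterministic run for illustration

def q_learning(n_states, n_actions, step, episodes=200,
               alpha=0.1, gamma=0.9, epsilon=0.1):
    """Tabular Q-learning; step(s, a) -> (next_state, reward, done)."""
    Q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(episodes):
        s, done = 0, False
        while not done:
            if random.random() < epsilon:
                a = random.randrange(n_actions)  # explore
            else:
                best = max(Q[s])                 # exploit, random tie-break
                a = random.choice([i for i, q in enumerate(Q[s]) if q == best])
            s2, r, done = step(s, a)
            # Core update: move Q(s,a) toward r + gamma * max_a' Q(s',a')
            target = r + gamma * max(Q[s2]) * (not done)
            Q[s][a] += alpha * (target - Q[s][a])
            s = s2
    return Q

# Toy chain environment: states 0..4; action 1 moves right, action 0 moves
# left. Reaching state 4 yields reward 1 and ends the episode.
def chain_step(s, a):
    s2 = min(4, s + 1) if a == 1 else max(0, s - 1)
    return s2, (1.0 if s2 == 4 else 0.0), s2 == 4

Q = q_learning(n_states=5, n_actions=2, step=chain_step)
```

In the paper’s setting, the action subrewards would shape or augment the scalar reward `r` returned by the environment; the update rule itself is unchanged.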

Reinforcement Learning
Function Optimization Based on Subrewards in Hierarchical RL
Experiment and Analyses
Conclusions