Abstract

Image-based deep reinforcement learning has achieved remarkable results in recent years, but it still requires large amounts of interaction data and suffers from low training efficiency. To address this problem, we propose a new method that accelerates deep reinforcement learning by combining a hierarchical encoder network with an actor-critic RL algorithm. Using stacked extreme learning machines (ELMs) trained in a supervised manner, we convert high-dimensional raw images into meaningful state representations, which are then fed to the cascaded state-based RL algorithm. To handle continuous state and action spaces, we adopt a Cerebellar Model Articulation Controller (CMAC) network as the critic and use recursive least-squares temporal-difference (RLS-TD) learning to improve learning efficiency. The resulting algorithm provides a practical mechanism for solving image-based end-to-end continuous control problems. Simulations on the classic cart-pole task show that our method solves this problem efficiently and significantly improves learning effectiveness compared with previous deep reinforcement learning algorithms such as DDPG and PPO.
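The abstract only summarizes the approach, but the two building blocks it names are standard techniques, so a minimal sketch may help fix ideas. The code below illustrates, under stated assumptions, (i) training a single ELM layer, where the input weights are drawn at random and kept fixed and only the output weights are solved in closed form by least squares, and (ii) the classic recursive least-squares TD(0) update (Bradtke and Barto) for a linear value function. All names here (`train_elm`, `RLSTD`, the feature dimensions) are hypothetical illustrations, not the paper's implementation; in particular, the critic's features in the paper would come from a CMAC tiling rather than the plain feature vectors assumed below.

```python
import numpy as np

rng = np.random.default_rng(0)

def train_elm(X, T, n_hidden=256):
    """Fit one ELM layer mapping inputs X (n, d) to targets T.

    Input weights and biases are random and fixed; only the output
    weights beta are learned, in closed form, via least squares.
    """
    n_features = X.shape[1]
    W = rng.normal(size=(n_features, n_hidden))   # random, fixed input weights
    b = rng.normal(size=n_hidden)                 # random, fixed biases
    H = np.tanh(X @ W + b)                        # hidden-layer activations
    beta, *_ = np.linalg.lstsq(H, T, rcond=None)  # closed-form output weights
    return W, b, beta

def elm_forward(X, W, b, beta):
    """Encode inputs with a trained ELM layer."""
    return np.tanh(X @ W + b) @ beta

class RLSTD:
    """Recursive least-squares TD(0) for a linear value function V(s) = theta . phi(s)."""

    def __init__(self, n_features, gamma=0.99, delta=1.0):
        self.gamma = gamma
        self.theta = np.zeros(n_features)
        self.P = np.eye(n_features) / delta       # inverse-covariance estimate

    def update(self, phi, reward, phi_next):
        d = phi - self.gamma * phi_next           # discounted feature difference
        P_phi = self.P @ phi
        k = P_phi / (1.0 + d @ P_phi)             # gain vector
        td_err = reward - d @ self.theta          # = r + gamma*V(s') - V(s)
        self.theta += k * td_err                  # parameter update
        self.P -= np.outer(k, d @ self.P)         # rank-1 covariance update
        return td_err
```

Stacking several such ELM layers, each trained on the previous layer's outputs, would yield a hierarchical encoder of the kind the abstract describes; the final layer's low-dimensional output would then serve as the state fed to the actor-critic learner, with an RLS-TD update of this form driving the critic.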
