Abstract

Reinforcement learning (RL) is a widely used paradigm for learning in autonomous dynamical systems. A popular topic in this field is how to extend RL to continuous state and action spaces, so that it can be applied to more real-world problems. The ASE/ACE architecture, one of the best-known implementations of RL, is a possible solution. However, its convergence has been shown to be slower than that of methods based on discrete state and action spaces, such as Q-learning; the reason is that the continuous state and action spaces must be organized so that an indefinite search becomes a definite one. Moreover, few RL systems explore the action space by combining effective action sequences so as to capture regularities of the environment and make them reusable. We add a memory-based sequence structure, and correspondingly an adaptive action sequence critic, to the actor/critic architecture in order to organize the action space. By generating and organizing action sequences in the continuous action space, the new model improves learning speed and acquires environment-oriented skills. Experiments on a benchmark double-integrator problem and a more complicated two-dimensional problem demonstrate the effectiveness of the new model.
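To make the setting concrete, the sketch below shows the double-integrator benchmark mentioned in the abstract together with a minimal memory of effective action sequences keyed by the state at which each sequence started. The environment dynamics (a position/velocity state driven by a continuous acceleration action) are the standard formulation of this benchmark; the SequenceMemory class, its state discretisation, and the quadratic step cost are illustrative assumptions, not the paper's actual structures.

import numpy as np

def double_integrator_step(state, u, dt=0.05):
    # Standard double-integrator dynamics: state = (position x, velocity v),
    # the continuous action u is an acceleration applied for one step dt.
    x, v = state
    v = v + u * dt
    x = x + v * dt
    return np.array([x, v])

def step_cost(state, u):
    # Illustrative quadratic cost; the paper's reward function is not given
    # in the abstract.
    x, v = state
    return x * x + 0.1 * u * u

class SequenceMemory:
    # Hypothetical memory of action sequences that worked well, keyed by a
    # coarse discretisation of the state at which the sequence started.
    def __init__(self, resolution=0.2):
        self.resolution = resolution
        self.store = {}

    def _key(self, state):
        return tuple(np.round(np.asarray(state) / self.resolution).astype(int))

    def add(self, start_state, actions, total_return):
        # Keep only the best-returning sequence seen from this state region.
        k = self._key(start_state)
        best = self.store.get(k)
        if best is None or total_return > best[1]:
            self.store[k] = (list(actions), total_return)

    def suggest(self, state):
        # Return a stored sequence to replay from a similar state, if any.
        entry = self.store.get(self._key(state))
        return entry[0] if entry else None

An actor/critic learner could, for example, replay a suggested sequence when one exists and fall back to its stochastic actor otherwise; the adaptive action sequence critic described in the abstract would then judge whether a stored sequence remains worth keeping.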
