Abstract

For task-nonspecific robot work in changing environments, most research on developmental learning grounded in cognitive psychology advocates a staged developmental process and an explicit hierarchical action model. Although existing developmental learning approaches have made progress, two open problems remain: (a) when numerous tasks are involved, the learning speed is often unsatisfactory; and (b) when these tasks are not specified in advance, the hierarchical action model is hard to design beforehand or to learn automatically. To address these two problems, this paper proposes a new developmental reinforcement learning approach, presented with its model and algorithms. In our model, any actor-critic learning model can be encapsulated as a learning infrastructure to build an implicit action model called the reward-policy mapping, and a self-motivated module supports autonomous robots. The proposed approach effectively supports an autonomous, interactive, cumulative, and online learning process for task-nonspecific robots. Simulation results show that, to learn to perform nearly twenty thousand tasks, the proposed approach needs only half the time required by its counterpart, the encapsulated actor-critic learning algorithm.
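For context on the encapsulated learning infrastructure, the sketch below shows a minimal tabular actor-critic learner on a toy chain environment. This is only a generic illustration of the actor-critic scheme the abstract refers to; the paper's reward-policy mapping and self-motivated module are not reproduced here, and the environment, step sizes, and episode count are all hypothetical choices for the example.

```python
import numpy as np

# Minimal tabular actor-critic sketch (illustrative only).
# Environment: a 5-state chain; moving right past the last state
# ends the episode with reward 1, all other transitions give 0.

N_STATES, N_ACTIONS = 5, 2                # actions: 0 = left, 1 = right
ALPHA_ACTOR, ALPHA_CRITIC, GAMMA = 0.1, 0.2, 0.95

rng = np.random.default_rng(0)
theta = np.zeros((N_STATES, N_ACTIONS))   # actor: policy preferences
V = np.zeros(N_STATES)                    # critic: state-value estimates

def softmax(prefs):
    e = np.exp(prefs - prefs.max())
    return e / e.sum()

def step(s, a):
    """Chain dynamics: stepping right from the last state is terminal."""
    s2 = max(s - 1, 0) if a == 0 else s + 1
    if s2 == N_STATES:
        return None, 1.0                  # terminal transition, reward 1
    return s2, 0.0

for _ in range(500):                      # episodes
    s = 0
    while s is not None:
        probs = softmax(theta[s])
        a = rng.choice(N_ACTIONS, p=probs)
        s2, r = step(s, a)
        target = r + (GAMMA * V[s2] if s2 is not None else 0.0)
        td_error = target - V[s]          # the critic's TD error drives both updates
        V[s] += ALPHA_CRITIC * td_error
        grad = -probs
        grad[a] += 1.0                    # gradient of log softmax policy at (s, a)
        theta[s] += ALPHA_ACTOR * td_error * grad
        s = s2

# After training, the learned policy should prefer "right" in every state.
learned = [int(np.argmax(theta[s])) for s in range(N_STATES)]
```

In the paper's framing, such a learner is treated as a black-box infrastructure: the developmental model is built around it rather than inside it, so any actor-critic variant could be plugged in.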
