Abstract

This paper describes a policy transfer method for a reinforcement learning agent based on the spreading activation model from cognitive psychology. The method promises to increase the possibility of policy reuse, to adapt to multiple tasks, and to assess differences between agent mechanisms. In existing methods, policies are evaluated and manually selected depending on the target task. The proposed method generates a policy network that calculates the relevance between policies in order to select and transfer a specific policy presumed to be effective for the agent's current situation during learning. Using the graph structure of the policy network, the method decides on the most effective policy by repeating probabilistic selection, activation, and spreading steps. The experiment section describes experiments conducted to evaluate the usefulness, conditions of use, and applicable range of the proposed method. Tests using CartPole and MountainCar, two classical reinforcement learning tasks, are described, and transfer learning with the proposed method is compared against a Deep Q-Network without transfer. The experimental results suggest that, under various conditions, the proposed method is useful for transfer learning on the same task without manual policy selection, compared with the previous method.
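To make the selection loop concrete, the following is a minimal sketch of spreading-activation policy selection over a policy network, assuming nodes are stored policies and weighted edges encode pairwise relevance. All names (PolicyNetwork, decay, temperature) and the specific update rule are illustrative assumptions, not details taken from the paper.

```python
# Hypothetical sketch: spreading activation over a policy graph, followed by
# probabilistic (softmax) selection of the policy to transfer.
import math
import random

class PolicyNetwork:
    def __init__(self):
        self.activation = {}   # policy name -> current activation level
        self.edges = {}        # policy name -> list of (neighbor, relevance weight)

    def add_policy(self, name):
        self.activation.setdefault(name, 0.0)
        self.edges.setdefault(name, [])

    def add_relevance(self, a, b, weight):
        # Relevance between two policies; treated as symmetric for simplicity.
        self.add_policy(a)
        self.add_policy(b)
        self.edges[a].append((b, weight))
        self.edges[b].append((a, weight))

    def activate(self, source, amount=1.0):
        # Inject activation at the policy that best matches the agent's situation.
        self.activation[source] += amount

    def spread(self, decay=0.5, steps=3):
        # Repeatedly propagate a decayed share of each node's activation
        # to its neighbors, scaled by edge relevance.
        for _ in range(steps):
            delta = {n: 0.0 for n in self.activation}
            for node, act in self.activation.items():
                for neighbor, w in self.edges[node]:
                    delta[neighbor] += decay * w * act
            for node, d in delta.items():
                self.activation[node] += d

    def select(self, temperature=1.0):
        # Probabilistic selection: softmax over activation levels.
        nodes = list(self.activation)
        exps = [math.exp(self.activation[n] / temperature) for n in nodes]
        total = sum(exps)
        return random.choices(nodes, weights=[e / total for e in exps])[0]

net = PolicyNetwork()
net.add_relevance("cartpole_v0", "cartpole_v1", 0.9)
net.add_relevance("cartpole_v0", "mountaincar_v0", 0.2)
net.activate("cartpole_v0")   # current situation resembles this source task
net.spread()
print(net.select())           # policy chosen for transfer
```

Under this assumed rule, highly relevant neighbors of the activated policy accumulate activation over repeated spreading steps, so the softmax selection favors policies related to the current situation while retaining some exploration.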
