Abstract

Temporal abstraction and exploration are both important factors in determining performance in reinforcement learning. The authors propose focusing on the deterministic exploration behavior that is acquired through reinforcement learning. In this paper, the novel idea that deterministic exploration behavior can be regarded as temporally abstract actions, or macro actions, is introduced. Simulations showed that the deterministic exploration behavior obtained through the learning of one task accelerates the learning of another, similar task without any explicit definition of abstract actions. A recurrent neural network was used for the learning, and the knowledge obtained in the first learning phase was used effectively in the second without being completely destroyed, even though the transfer did not succeed in a more difficult task. Furthermore, when the agent was returned to the first task, learning was still faster than learning from scratch. An interesting phenomenon was also observed in the simulations: context-based exploration behavior was acquired through the learning of a task that did not require such behavior.
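The abstract describes the setup only at a high level: a recurrent policy is trained on one task, and the same network is then trained on a similar second task, so that the deterministic exploration behavior acquired first can serve as a macro action in the new task. The sketch below is a minimal illustration of that two-task arrangement, not the authors' implementation; the GRU policy, the REINFORCE update, the toy chain environment, and all hyperparameters are assumptions made for the example.

```python
# Illustrative sketch only: an RNN policy is trained on task A, then the same
# weights are reused on a similar task B. Environment, architecture, and
# training rule are assumptions, not the paper's actual setup.
import torch
import torch.nn as nn

class ChainEnv:
    """Toy 1-D chain: start at cell 0, reward 1 on reaching `goal`."""
    def __init__(self, goal, length=8):
        self.goal, self.length = goal, length
    def reset(self):
        self.pos = 0
        return self._obs()
    def _obs(self):
        x = torch.zeros(self.length)
        x[self.pos] = 1.0          # one-hot encoding of the agent's position
        return x
    def step(self, action):        # action: 0 = left, 1 = right
        delta = 1 if action == 1 else -1
        self.pos = max(0, min(self.length - 1, self.pos + delta))
        done = self.pos == self.goal
        return self._obs(), (1.0 if done else 0.0), done

class RNNPolicy(nn.Module):
    """Recurrent policy: the GRU hidden state carries context across steps."""
    def __init__(self, obs_dim, hidden=32, n_actions=2):
        super().__init__()
        self.hidden = hidden
        self.cell = nn.GRUCell(obs_dim, hidden)
        self.head = nn.Linear(hidden, n_actions)
    def forward(self, obs, h):
        h = self.cell(obs.unsqueeze(0), h)
        return torch.distributions.Categorical(logits=self.head(h)), h

def train(policy, env, episodes=300, max_steps=30, lr=1e-2):
    opt = torch.optim.Adam(policy.parameters(), lr=lr)
    for _ in range(episodes):
        obs, h = env.reset(), torch.zeros(1, policy.hidden)
        logps, ret = [], 0.0
        for _ in range(max_steps):
            dist, h = policy(obs, h)
            a = dist.sample()
            logps.append(dist.log_prob(a))
            obs, r, done = env.step(a.item())
            ret += r
            if done:
                break
        # REINFORCE without a baseline: scale the trajectory's log-probability
        # by its (undiscounted) return.
        loss = -torch.stack(logps).sum() * ret
        opt.zero_grad()
        loss.backward()
        opt.step()

policy = RNNPolicy(obs_dim=8)
train(policy, ChainEnv(goal=7))    # task A: learn the first task from scratch
train(policy, ChainEnv(goal=5))    # task B: reuse the same weights, no reset
```

Because the GRU's hidden state carries information across steps, the behavior acquired on the first task can depend on recent history, which is the property the abstract refers to as context-based exploration.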
