Modular Neural Networks for Reinforcement Learning with Temporal Intrinsic Rewards

Johane Takeuchi, Osamu Shouno, Hiroshi Tsujino

doi:10.1109/ijcnn.2007.4371120

Abstract

Inspired by intrinsic motivation, which is thought to play a crucial role in animal development and learning, several artificial learning systems with built-in intrinsic rewards have recently been studied. Here we propose an intrinsically rewarded learning system for autonomous task achievement that copes with several kinds of transitions. The system consists of neural networks equipped with a modular reinforcement learning algorithm. The modular system, which decomposes the observed state space, stabilizes the intrinsic rewards calculated from prediction errors. On-line learning with the proposed system proceeds under various kinds of transitions, including deterministic, probabilistic, and partially observable ones, without any transition-specific adjustment of parameters. Combining the modular network with the intrinsic reward generator yielded performance that converged to the optimal sequences of actions in all tested transitions, in which external rewards were delivered only at the completion of tasks.
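
As a rough illustration of an intrinsic reward calculated from prediction errors, the sketch below uses a simple count-based forward model in place of the paper's modular neural networks. The abstract does not give the reward formula, so everything here is an assumption made for illustration: the class `TabularPredictor`, the functions `intrinsic_reward` and `total_reward`, the error measure (one minus the predicted probability of the observed next state), and the weight `beta` are all hypothetical and are not the authors' method.

```python
from collections import defaultdict


class TabularPredictor:
    """Hypothetical count-based forward model estimating P(next_state | state, action)."""

    def __init__(self):
        # counts[(state, action)][next_state] = number of observed transitions
        self.counts = defaultdict(dict)

    def prob(self, state, action, next_state):
        seen = self.counts.get((state, action), {})
        total = sum(seen.values())
        if total == 0:
            return 0.0  # nothing observed yet, so the transition is fully unexpected
        return seen.get(next_state, 0) / total

    def update(self, state, action, next_state):
        seen = self.counts[(state, action)]
        seen[next_state] = seen.get(next_state, 0) + 1


def intrinsic_reward(predictor, state, action, next_state):
    """Prediction error in [0, 1], measured before the model learns from the transition."""
    error = 1.0 - predictor.prob(state, action, next_state)
    predictor.update(state, action, next_state)
    return error


def total_reward(external, intrinsic, beta=0.1):
    """Assumed combination: sparse external reward plus a weighted intrinsic bonus."""
    return external + beta * intrinsic
```

Under these assumptions, a deterministic transition stops producing intrinsic reward once it has been observed, whereas a probabilistic transition keeps producing a nonzero prediction error. A raw prediction-error signal therefore behaves differently across transition types, which is the kind of instability the abstract says the modular decomposition is meant to address.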

Full Text