Abstract

Episodic memory plays an important role in animal behavior: it allows general skills to be reused for solving specific tasks in a changing environment. This beneficial feature of biological cognitive systems has not yet been successfully incorporated into artificial neural architectures. In this paper we propose a neural architecture with shared episodic memory for multi-task reinforcement learning (SEM-PAAC). The architecture extends Parallel Advantage Actor-Critic (PAAC) with two recurrent sub-networks that separately track the environment state and the task state: the first sub-network stores episodic memory, and the second enables task-specific policy execution. Experiments in the Taxi domain show that SEM-PAAC matches the performance of PAAC when subtasks are solved separately, but is significantly better when subtasks are solved jointly to complete the full Taxi task, owing to the reuse of episodic memory. The proposed architecture also successfully learns to predict task completion. This is a step toward more autonomous agents for multi-task problems.
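To make the two-sub-network idea concrete, here is a minimal sketch of how such an actor-critic head could be wired: one recurrent core holds episodic memory of the environment and is shared across subtasks within an episode, while a second recurrent core tracks the state of the current task. All layer sizes, names, and the exact wiring are illustrative assumptions, not the paper's published model.

```python
import torch
import torch.nn as nn

class SEMPAACHead(nn.Module):
    """Hypothetical sketch of a shared-episodic-memory actor-critic head.

    Two recurrent sub-networks: `memory_rnn` accumulates episodic memory of
    environment observations (shared across subtasks), and `task_rnn` tracks
    the state of the currently active task. Their concatenated hidden states
    feed the policy, value, and task-completion heads.
    """

    def __init__(self, obs_dim, task_dim, n_actions,
                 mem_dim=128, task_state_dim=64):
        super().__init__()
        # Sub-network 1: episodic memory over environment observations.
        self.memory_rnn = nn.GRUCell(obs_dim, mem_dim)
        # Sub-network 2: task-state tracker conditioned on the task encoding
        # and the current episodic memory.
        self.task_rnn = nn.GRUCell(task_dim + mem_dim, task_state_dim)
        joint = mem_dim + task_state_dim
        self.policy = nn.Linear(joint, n_actions)  # actor head
        self.value = nn.Linear(joint, 1)           # critic head
        self.done_pred = nn.Linear(joint, 1)       # task-completion predictor

    def forward(self, obs, task, h_mem, h_task):
        h_mem = self.memory_rnn(obs, h_mem)          # update episodic memory
        h_task = self.task_rnn(
            torch.cat([task, h_mem], dim=-1), h_task)  # update task state
        z = torch.cat([h_mem, h_task], dim=-1)
        logits = self.policy(z)                      # action logits
        value = self.value(z).squeeze(-1)            # state value estimate
        done_logit = self.done_pred(z).squeeze(-1)   # task-completion logit
        return logits, value, done_logit, h_mem, h_task
```

Under this sketch, switching subtasks would mean resetting or swapping `h_task` while carrying `h_mem` forward, which is one plausible way the shared episodic memory could be reused across subtasks of the full Taxi task.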
