Reinforcement learning (RL) is the core mechanism of interactive learning in both biological and artificial agents. Yet, in contrast to humans and animals, artificial RL agents learn very slowly and suffer from the curse of dimensionality. This is partly due to using RL in isolation, i.e., the lack of social learning and social diversity. We introduce a free-energy-based social RL framework for learning novel tasks. The society consists of the main learning agent and a set of diverse virtual agents. The diversity lies in their perception: all agents learn from the same interaction samples and share the same action set, but each perceives the state differently. Although individual differences in perception typically cause perceptual aliasing, they can also let the virtual agents learn faster in early trials. Our free-energy method provides a knowledge-integration mechanism through which the main agent exploits this diversity to reduce its regret. It rests on Thompson sampling and on the behavioral policies of the main and virtual agents, and is therefore applicable to a wide range of tasks with discrete or continuous state spaces, to both model-free and model-based settings, and to different reinforcement learning methods. Through a set of experiments, we show that this general framework greatly improves learning speed and clearly outperforms existing methods. We also provide a convergence proof.
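The abstract does not specify the algorithm, so the following minimal Python sketch only illustrates the setup it describes, under stated assumptions: tabular Q-learning agents whose perception functions differ (virtual agents see aggregated, aliased states), a single shared stream of interaction samples, and integration weights computed from a running TD-error "surprise" proxy standing in for the paper's free-energy term. The class and function names, the weighting scheme, and the noise-based Thompson-style exploration are all hypothetical, not the authors' actual method.

```python
import numpy as np

# Hypothetical sketch of the social-RL setup described in the abstract:
# one main agent plus virtual agents with aliased (coarsened) perception,
# all updated from the same interaction samples over a shared action set.
# The integration weights are a softmax over a running |TD-error| proxy,
# used here as a stand-in for the paper's (unspecified) free-energy measure.

N_STATES, N_ACTIONS = 20, 2
rng = np.random.default_rng(0)

class Agent:
    def __init__(self, n_percepts, percept_fn, alpha=0.1, gamma=0.95):
        self.Q = np.zeros((n_percepts, N_ACTIONS))
        self.percept = percept_fn          # maps true state -> agent's percept
        self.alpha, self.gamma = alpha, gamma
        self.recent_td = 1.0               # running surprise proxy (assumption)

    def update(self, s, a, r, s2):
        p, p2 = self.percept(s), self.percept(s2)
        td = r + self.gamma * self.Q[p2].max() - self.Q[p, a]
        self.Q[p, a] += self.alpha * td
        self.recent_td = 0.9 * self.recent_td + 0.1 * abs(td)

main = Agent(N_STATES, lambda s: s)                      # full perception
virtuals = [Agent(N_STATES // k, lambda s, k=k: s // k)  # aliased perception
            for k in (2, 4)]
society = [main] + virtuals

def integrated_action(s, beta=5.0):
    # Hypothetical knowledge integration: weight each agent's action values
    # by a softmax over -recent_td (low surprise ~ low free energy).
    w = np.exp([-beta * ag.recent_td for ag in society])
    w /= w.sum()
    prefs = np.array([ag.Q[ag.percept(s)] for ag in society])
    values = w @ prefs
    # Thompson-style exploration: perturb integrated values with noise
    # (a crude surrogate for posterior sampling).
    return int(np.argmax(values + 0.1 * rng.standard_normal(N_ACTIONS)))

# Toy chain task: move right to reach the goal state.
s = 0
for step in range(2000):
    a = integrated_action(s)
    s2 = min(s + 1, N_STATES - 1) if a == 1 else max(s - 1, 0)
    r = 1.0 if s2 == N_STATES - 1 else 0.0
    for ag in society:                     # same sample shared by all agents
        ag.update(s, a, r, s2)
    s = 0 if r > 0 else s2
```

In this toy setting, the aliased virtual agents have smaller Q-tables and can therefore propagate value faster in early episodes, which is the intuition the abstract attributes to perceptual diversity; the softmax weighting lets the main agent lean on whichever society member is currently least "surprised."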