Gathering information is crucial for maximizing fitness, but requires diverting resources from searching directly for primary rewards to actively exploring the environment. Optimal decision-making thus maximizes information while reducing effort costs, but little is known about the neuro-computational implementation of this trade-off. We present a Reinforcement Meta-Learning (RML) computational model that solves the trade-off between the value and costs of gathering information. We implement the RML in a biologically plausible architecture that links catecholaminergic neuromodulators, the medial prefrontal cortex and topographically organized visual maps, and show that it accounts for neural and behavioral findings on information demand motivated by instrumental incentives and intrinsic utility. Moreover, the utility function used by the RML, encoded by dopamine, is an approximation of variational free energy. Thus, the RML presents a biologically plausible mechanism for coordinating motivational, executive and sensory systems to generate visual information-gathering policies that minimize free energy.