A key challenge of learning a new task is that the environment is high dimensional: there are many different sensory features and possible actions, with typically only a small reward-relevant subset. Although animals can learn to perform complex tasks that involve arbitrary associations between stimuli, actions, and rewards,1,2,3,4,5,6 a consistent and striking result across varied experimental paradigms is that in initially acquiring such tasks, large differences between individuals are apparent in the learning process.7,8,9,10,11,12 What neural mechanisms contribute to initial task acquisition, and why do some individuals learn a new task much more quickly than others? To address these questions, we recorded longitudinally from dopaminergic (DA) axon terminals in mice learning a visual decision-making task.7 Across striatum, DA responses tracked idiosyncratic and side-specific learning trajectories, consistent with widespread reward prediction error coding across DA terminals. However, even before any rewards were delivered, contralateral-side-specific visual responses were present in DA terminals, primarily in the dorsomedial striatum (DMS). These pre-existing responses predicted the extent of learning for contralateral stimuli. Moreover, activation of these terminals improved contralateral performance. Thus, the initial conditions of a projection-specific and feature-specific DA signal help explain individual learning trajectories. More broadly, this work suggests that functional heterogeneity across DA projections may serve to bias target regions toward learning about different subsets of task features, providing a potential mechanism to address the dimensionality of the initial task learning problem.