Deciding whether to forego immediate rewards or explore new opportunities is a key component of flexible behavior and is critical for the survival of the species. Although previous studies have shown that different cortical and subcortical areas, including the amygdala and ventral striatum (VS), are implicated in representing the immediate (exploitative) and future (explorative) value of choices, the effect of the motor system used to make choices has not been examined. Here, we tested male rhesus macaques with amygdala or VS lesions on two versions of a three-arm bandit task where choices were registered with either a saccade or an arm movement. In both tasks we presented the monkeys with explore-exploit tradeoffs by periodically replacing familiar options with novel options that had unknown reward probabilities. We found that monkeys explored more with saccades but showed better learning with arm movements. VS lesions caused the monkeys to be more explorative with arm movements and less explorative with saccades, although this may have been due to an overall decrease in performance. VS lesions affected the monkeys' ability to learn novel stimulus-reward associations in both tasks, while after amygdala lesions this effect was stronger when choices were made with saccades. Further, on average, VS and amygdala lesions reduced the monkeys' ability to choose better options only when choices were made with a saccade. These results show that learning reward value associations to manage explore-exploit behaviors is motor system dependent and they further define the contributions of amygdala and VS to reinforcement learning.
Read full abstract