Classical models of cerebellar computation posit that climbing fibers (CFs) operate according to supervised learning rules, correcting movements by signaling motor errors. However, recent findings suggest that in some behaviors, CF activity can exhibit features that resemble the instructional signals necessary for reinforcement learning, namely reward prediction errors (rPEs). Despite these initial observations, many key properties of reward-related CF responses remain unclear, limiting our understanding of how they operate to guide cerebellar learning. Here, we used two-photon calcium imaging to measure the postsynaptic responses of CFs onto cerebellar Purkinje cells, testing how CFs respond to learned stimuli that either do or do not predict reward. We find that CFs can develop generalized responses to similar cues of the same modality, regardless of whether those cues are reward-predictive. However, this generalization depends on temporal context and does not extend across sensory modalities. Further, learned CF responses are flexible and can be rapidly updated according to new reward contingencies. Together, these results suggest that CFs can generate learned, reward-predictive responses that flexibly adapt to the current environment in a context-sensitive manner.