Song acquisition in songbirds is a notable example of trial-and-error learning that parallels human speech acquisition, so studying songbird vocal learning can offer insights into the mechanisms underlying human language. We present a computational model of song learning that integrates reinforcement learning (RL) and Hebbian learning and is consistent with known songbird circuitry. The output of the song circuit is the activity of nucleus RA, which receives two primary inputs: timing information from area HVC and stochastic activity from nucleus LMAN. Song learning additionally relies on Area X, a basal ganglia area that receives dopaminergic inputs from VTA. In our model, song is first acquired in the HVC-to-Area X connectivity through an RL mechanism based on node perturbation. This information is then consolidated into HVC-to-RA synapses through a Hebbian mechanism. The transfer of weights from Area X to RA takes place via the thalamus and uses a specific form of spike-timing-dependent plasticity (STDP). Thus, we present a computational model grounded in songbird circuitry in which the optimal policy is initially acquired through RL and subsequently transferred to another circuit through Hebbian plasticity.
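The two-stage idea described above (node-perturbation RL in one pathway, followed by Hebbian consolidation into another) can be illustrated with a toy rate-based sketch. This is not the model from the paper: the one-hot HVC code, the quadratic reward, the noise-free reward baseline, the plain Hebbian rule standing in for the STDP-based transfer through the thalamus, and all names and parameter values (`train`, `eta_rl`, `eta_hebb`, `sigma`, the random `target`) are assumptions made for illustration. For simplicity, both learning rules run on every rendition rather than in separate phases.

```python
import numpy as np

def train(n_renditions=3000, n_steps=5, n_motor=3, sigma=0.05,
          eta_rl=0.05, eta_hebb=0.02, seed=0):
    """Toy sketch: node-perturbation RL plus Hebbian consolidation."""
    rng = np.random.default_rng(seed)
    # Hypothetical tutor song: one target motor vector per time step.
    target = rng.uniform(-1.0, 1.0, size=(n_motor, n_steps))
    hvc = np.eye(n_steps)                 # assumed: one HVC unit active per time step
    w_x = np.zeros((n_motor, n_steps))    # HVC -> Area X pathway, trained by RL
    w_ra = np.zeros((n_motor, n_steps))   # HVC -> RA weights, trained by Hebbian rule
    for _ in range(n_renditions):
        for t in range(n_steps):
            h = hvc[:, t]
            bias = w_x @ h                # drive relayed to RA from the Area X pathway
            mean_out = w_ra @ h + bias    # RA output without exploration noise
            xi = sigma * rng.standard_normal(n_motor)  # LMAN-like perturbation
            # Reward is the negative squared error against the target.
            r_perturbed = -np.sum((mean_out + xi - target[:, t]) ** 2)
            r_baseline = -np.sum((mean_out - target[:, t]) ** 2)
            # Node perturbation: correlate the reward change with the noise.
            w_x += eta_rl * (r_perturbed - r_baseline) * np.outer(xi, h) / sigma**2
            # Hebbian consolidation: RA drive from the RL pathway (post)
            # times HVC activity (pre).
            w_ra += eta_hebb * np.outer(bias, h)
    return target, w_x, w_ra

if __name__ == "__main__":
    target, w_x, w_ra = train()
    print("max |w_ra - target|:", np.abs(w_ra - target).max())
    print("max |w_x|:", np.abs(w_x).max())
```

In this sketch the RL pathway first absorbs the error; as the Hebbian rule copies its contribution into `w_ra`, the reward signal pushes `w_x` back toward zero, so the learned output ends up stored in the HVC-to-RA weights.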