Many traditional neural network learning algorithms lack a biologically plausible method of assigning credit to the nodes in the earlier levels of the network that played a decisive role in the stimulus-response mapping. In the Attention-Gated Reinforcement Learning model (AGREL) [1], this credit assignment problem is overcome by the focal influence of attention, conveyed through the feedback connections in the network. In each trial, a global reward value (conveyed by neuromodulators) is calculated and serves to increase the likelihood that successful behavior will be repeated. The activity in the output level of the network, which is constrained so that only a single node may be active (attended) at a time, is propagated back through the network via feedback connections. These feedback connections, which are reciprocal in strength to the feedforward connections, allow the network to selectively target the lower-level nodes that drove the current output. In combination with the global reward factor, this allows the network to strengthen connections with the nodes that caused successful behaviors (see Figure 1). This attentional gating of the reward signal offers a biologically realistic solution to the credit assignment problem and, in so doing, provides a coherent and unifying framework for learning with attentional selection at its core. AGREL has previously been shown to closely replicate the changes in tuning curves observed in two physiological categorization tasks [1]. In the current study, we first replicate the findings of two further physiological tasks using AGREL, and second, examine the effect of removing the feedback connections on our model's performance in these and the previous tasks.
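The mechanism described above can be illustrated with a minimal Python sketch of a single learning trial. It is a simplified illustration under stated assumptions, not the exact update rule of [1]: the function name `agrel_like_trial`, the single hidden layer, the softmax choice of the winning output node, the reward values of 1 and 0, the form of the reward prediction error, and the learning rate are all hypothetical choices made for this example. What it is meant to show is the gating: only the winning output node sends feedback, the feedback weight is taken to equal the corresponding feedforward weight, and the global reward term scales every weight change.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def agrel_like_trial(x, target, w_hid, w_out, rng, lr=0.5):
    """Run one trial of a simplified attention-gated update (illustrative only).

    x      : list of input activities
    target : index of the rewarded output node
    w_hid  : hidden-by-input weight matrix (list of lists), updated in place
    w_out  : output-by-hidden weight matrix (list of lists), updated in place
    Returns (winner, reward).
    """
    # Feedforward pass.
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x))) for row in w_hid]
    drive = [sum(w * h for w, h in zip(row, hidden)) for row in w_out]

    # Winner-take-all output: exactly one node is selected (assumed softmax rule).
    peak = max(drive)
    expd = [math.exp(d - peak) for d in drive]
    total = sum(expd)
    probs = [e / total for e in expd]
    winner = rng.choices(range(len(probs)), weights=probs)[0]

    # Global reward signal, shared by all synapses (assumed prediction-error form).
    reward = 1.0 if winner == target else 0.0
    delta = reward - probs[winner]

    # Feedback from the winning node only; feedback strength is assumed equal
    # to the feedforward weight. Copy before the output weights are changed.
    feedback = list(w_out[winner])

    # Output weights: only the winner's incoming connections change.
    for j, h in enumerate(hidden):
        w_out[winner][j] += lr * delta * h

    # Hidden weights: plasticity is gated by the feedback from the winner.
    for j, h in enumerate(hidden):
        gate = feedback[j] * h * (1.0 - h)
        for i, xi in enumerate(x):
            w_hid[j][i] += lr * delta * gate * xi

    return winner, reward
```

Calling `agrel_like_trial([1.0, 0.0], 0, w_hid, w_out, random.Random(0))` with small weight matrices updates them in place. Removing the feedback connections, as examined in the study, would correspond in this sketch to setting `feedback` to zeros, which leaves the hidden weights unchanged by the trial.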