Abstract
Reinforcement learning is a theoretical framework for learning how to act in an unknown environment through trial and error. One reinforcement learning framework, proposed by Sallans and Hinton [1], which we call free-energy-based reinforcement learning (FERL), possesses many desirable characteristics, such as the ability to deal with high-dimensional sensory inputs and goal-directed representation learning, as well as neurally plausible characteristics, such as population coding of action-values and a Hebbian learning rule modulated by reward prediction errors. These characteristics suggest that FERL may be implemented in the brain. To better understand the neural implementation of reinforcement learning and to pursue the neural plausibility of FERL, we implemented FERL in a spiking neural network, which is more realistic than the binary stochastic neurons of the original framework. The FERL framework uses a restricted Boltzmann machine (RBM) as a building block. The RBM is an energy-based statistical model with binary nodes separated into visible and hidden layers. In the RBM, owing to its bipartite connectivity, the posterior distribution over the hidden nodes given the visible nodes factorizes, making the posterior simple to compute [2]. We implemented the RBM as a spiking neural network of leaky integrate-and-fire neurons. The network is composed of state, action, and hidden layers. The state and action layers consist of several modules (neuron groups) associated with certain states and actions. All state neurons are unidirectionally connected to all hidden neurons. Action neurons are bidirectionally connected to hidden neurons so that the selected action is reflected in the hidden activations. The action-values, which are approximated by the negative free energy, can in turn be estimated from the firing of the hidden neurons. All connection weights are updated by a Hebbian learning rule modulated by the reward prediction error. The agent selects actions based on the activation of the action neurons. Our spiking neural network solved reinforcement learning tasks with both low- and high-dimensional observations. All of the desirable characteristics of the FERL framework were preserved in this extension. In both cases, the negative free energy properly represented the action-values. The free energies estimated by the spiking neural network are highly correlated with those estimated by the original RBM. After reward-based learning, the activation patterns of the hidden neurons reflect goal-oriented, action-based categories (Figure 1).

Figure 1. Performance of the spiking neural network. A: Free energies estimated by the spiking neural network and by the original RBM; the two are highly correlated (correlation coefficient r = 0.9485). B: Activations of the hidden neurons projected onto the first two principal components.
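To make the computations described above concrete, here is a minimal sketch of non-spiking FERL with a standard binary RBM, in the spirit of Sallans and Hinton [1]: the action-value of a (state, action) pair is the negative free energy of the corresponding visible configuration, and every weight is updated by a Hebbian product of visible activity and expected hidden activity, scaled by the reward prediction error. The network sizes, learning parameters, and all function and variable names below are illustrative assumptions rather than values from the paper, and the RBM biases are omitted for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes and parameters (assumptions, not the paper's values).
n_states, n_actions, n_hidden = 8, 4, 16
W_s = rng.normal(0.0, 0.1, (n_states, n_hidden))   # state -> hidden weights
W_a = rng.normal(0.0, 0.1, (n_actions, n_hidden))  # action <-> hidden weights
alpha, gamma, beta = 0.1, 0.9, 2.0  # learning rate, discount, inverse temperature

def hidden_input(s, a):
    """Total input x_j to each hidden node for one-hot state s and action a."""
    return W_s[s] + W_a[a]

def hidden_posterior(s, a):
    """Factorized posterior p(h_j = 1 | s, a) = sigmoid(x_j); the hidden
    nodes are statistically decoupled given the visible nodes [2]."""
    return 1.0 / (1.0 + np.exp(-hidden_input(s, a)))

def q_value(s, a):
    """Action-value as the negative free energy of the (s, a) configuration.
    With binary hidden units and no biases, F(s, a) = -sum_j softplus(x_j),
    so Q(s, a) = -F(s, a) = sum_j log(1 + exp(x_j))."""
    return np.sum(np.log1p(np.exp(hidden_input(s, a))))

def select_action(s):
    """Softmax (Boltzmann) action selection over the negative free energies."""
    q = np.array([q_value(s, a) for a in range(n_actions)])
    p = np.exp(beta * (q - q.max()))
    return rng.choice(n_actions, p=p / p.sum())

def td_update(s, a, r, s_next, a_next, terminal=False):
    """SARSA-style update: the reward prediction error delta scales a Hebbian
    term, presynaptic activity times expected hidden activity. For one-hot
    inputs, only the rows of the visited state and action change."""
    target = r if terminal else r + gamma * q_value(s_next, a_next)
    delta = target - q_value(s, a)
    h = hidden_posterior(s, a)  # expected hidden activations <h_j>
    W_s[s] += alpha * delta * h
    W_a[a] += alpha * delta * h
    return delta
```

The gradient of the negative free energy with respect to a weight w_ij is v_i⟨h_j⟩, which is why the update takes this Hebbian form; the abstract's point is that the same quantity can be read out from the firing of the hidden neurons in the spiking implementation.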
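The abstract states that the RBM is realized with leaky integrate-and-fire neurons. As a point of reference, the sketch below shows generic leaky integrate-and-fire membrane dynamics with standard textbook constants; the actual neuron parameters and the mechanism by which spikes approximate the RBM's stochastic binary units belong to the paper and are not reproduced here.

```python
import numpy as np

def simulate_lif(input_current, dt=1e-4, tau_m=20e-3, v_rest=-70e-3,
                 v_thresh=-50e-3, v_reset=-70e-3, r_m=1e8):
    """Forward-Euler integration of a leaky integrate-and-fire neuron:
        tau_m * dV/dt = -(V - v_rest) + R_m * I(t),
    with a spike and a reset to v_reset whenever V reaches v_thresh.
    All constants are generic textbook values (assumptions)."""
    v = v_rest
    spike_times = []
    for step, i_ext in enumerate(input_current):
        v += (-(v - v_rest) + r_m * i_ext) * dt / tau_m
        if v >= v_thresh:
            spike_times.append(step * dt)
            v = v_reset
    return spike_times

# Example: a constant 0.3 nA input for 200 ms produces regular firing.
spikes = simulate_lif(np.full(2000, 0.3e-9))
```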
Highlights
Reinforcement learning is a theoretical framework for learning how to act in an unknown environment through trial and error
The network is composed of state, action, and hidden layers
The state and action layers consist of several modules associated with certain states and actions

* Correspondence: nakano@oist.jp, Okinawa Institute of Science and Technology, Onna, Okinawa 904-0412, Japan