Hardware-Friendly Actor-Critic Reinforcement Learning Through Modulation of Spike-Timing-Dependent Plasticity

Nan Zheng, Pinaki Mazumder

doi:10.1109/tc.2016.2595580

Open Access

Journal: IEEE Transactions on Computers
Publication Date: Feb 1, 2017
Citations: 44
License type: publisher-specific-oa

Affiliation: University of Michigan–Ann Arbor

Abstract

In this work, we propose a hardware-friendly reinforcement learning algorithm. The learning algorithm is based on an actor-critic structure implemented with spiking neural networks (SNNs). A biologically plausible and hardware-friendly spike-timing-dependent plasticity learning rule is formulated and employed in the training of SNNs. Several important aspects of applying the learning rule in a reinforcement learning context are studied, especially from the circuit designers’ point of view. Pitfalls of potential noise mixing and correlated spikes are identified and properly addressed. To achieve a low-power learning architecture, techniques such as down-sampling data for certain learning blocks, injecting quantization noise as noisy residues in neurons, and proper memory partitioning are proposed. A 1-D state-value function learning problem and a 2-D maze walking problem are examined in this paper to illustrate the effectiveness of the proposed algorithm and learning rules. A low-power hardware architecture is proposed, and examples are implemented in Verilog. The hardware complexity of the proposed algorithm is analyzed, and potential solutions for breaking the memory bottleneck when the problem size grows large are also discussed.
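The abstract does not state the learning rule itself, so the following is only a minimal sketch of one common way a critic's temporal-difference (TD) error can modulate a spike-timing-dependent plasticity (STDP) weight change. The exponential window, the function names (`stdp_window`, `td_error`, `modulated_update`), and every constant are assumptions chosen for illustration, not the paper's formulation.

```python
import math


def stdp_window(dt, a_plus=1.0, a_minus=1.0, tau=20.0):
    """Exponential STDP window (assumed shape); dt = t_post - t_pre in ms."""
    if dt > 0:
        # Presynaptic spike precedes postsynaptic spike: potentiation.
        return a_plus * math.exp(-dt / tau)
    if dt < 0:
        # Postsynaptic spike precedes presynaptic spike: depression.
        return -a_minus * math.exp(dt / tau)
    return 0.0


def td_error(reward, v_now, v_next, gamma=0.9):
    """Standard TD(0) error that a critic could supply as a modulation signal."""
    return reward + gamma * v_next - v_now


def modulated_update(weight, dt, delta, lr=0.01, w_min=0.0, w_max=1.0):
    """Scale the STDP change by the TD error, then clip to a bounded range.

    Clipping stands in for the finite weight range of a hardware synapse
    (an assumption made here, not a detail taken from the abstract).
    """
    new_weight = weight + lr * delta * stdp_window(dt)
    return min(w_max, max(w_min, new_weight))


if __name__ == "__main__":
    delta = td_error(reward=1.0, v_now=0.2, v_next=0.5)
    print("TD error:", delta)
    print("weight after causal pairing:", modulated_update(0.5, dt=5.0, delta=delta))
    print("weight with negative TD error:", modulated_update(0.5, dt=5.0, delta=-delta))
```

In this sketch a positive TD error reinforces causal pre-before-post pairings, while a negative TD error reverses the sign of the same pairing, which is the general idea behind modulating STDP with a reward-related signal.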

Similar Papers

Rethinking Pretraining as a Bridge From ANNs to SNNs.
Yihan Lin ... Shijie Ma
IEEE Transactions on Neural Networks and Learning Systems | VOL. 35
01 Jul 2024

A Low-Power Actor-Critic Framework Based on Memristive Spiking Neural Network
Yaozhong Zhang ... Yue Zhou
IOP Conference Series: Earth and Environmental Science | VOL. 252
01 Apr 2019

Efficient training of supervised spiking neural networks via the normalized perceptron based learning rule
Xiurui Xie ... Malu Zhang
Neurocomputing | VOL. 241
17 Feb 2017

Simplified Spike-timing Dependent Plasticity Learning Rule of Spiking Neural Networks for Unsupervised Clustering
Hongyu Sun ... Yinjing Guo
01 Oct 2019
