Abstract

Spiking neural networks (SNNs) offer many advantages over traditional artificial neural networks (ANNs), such as biological plausibility, fast information processing, and energy efficiency. Although SNNs have been used to solve a variety of control tasks using the Spike-Timing-Dependent Plasticity (STDP) learning rule, existing solutions usually rely on hard-coded network architectures tailored to specific tasks rather than solving different kinds of tasks in a general way. This neglects one of the biggest advantages of ANNs, namely being general-purpose and easy to use owing to their simple network architecture, which usually consists of an input layer, one or more hidden layers, and an output layer. This paper addresses the problem by introducing an end-to-end learning approach for spiking neural networks built with a single hidden layer and all-to-all reward-modulated Spike-Timing-Dependent Plasticity (R-STDP) synapses. We use the supervised R-STDP learning rule to train two SNN-based sub-controllers to replicate desired obstacle-avoiding and goal-approaching behaviors provided by pre-generated datasets. Together they make up a target-reaching controller, which is used to drive a simulated mobile robot to a target area while avoiding obstacles in its path. We demonstrate the performance and effectiveness of our trained SNNs in achieving target-reaching tasks in different unknown scenarios.
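
For concreteness, the following is a minimal structural sketch (Python/NumPy) of the controller layout described above: two SNNs, each with a single hidden layer and all-to-all connections, whose outputs are combined into one target-reaching command. The layer sizes, class and function names, the rate-based forward pass, and the additive combination of the two sub-controllers are illustrative assumptions, not the paper's exact implementation.

import numpy as np

class ThreeLayerSNN:
    """Input layer -> one hidden layer -> output layer, fully connected."""
    def __init__(self, n_in, n_hidden, n_out, seed=0):
        rng = np.random.default_rng(seed)
        # All-to-all connections whose weights would be trained with R-STDP.
        self.w_in_hid = rng.uniform(0.0, 1.0, (n_in, n_hidden))
        self.w_hid_out = rng.uniform(0.0, 1.0, (n_hidden, n_out))

    def forward(self, input_rates):
        # Rate-based placeholder standing in for the spiking dynamics.
        hidden = np.tanh(input_rates @ self.w_in_hid)
        return hidden @ self.w_hid_out

# One sub-controller per behavior; their outputs (e.g., left/right motor
# speeds) are combined into the final target-reaching command.
obstacle_avoiding = ThreeLayerSNN(n_in=8, n_hidden=16, n_out=2)
goal_approaching = ThreeLayerSNN(n_in=4, n_hidden=16, n_out=2)

def target_reaching_command(obstacle_input, goal_input):
    return obstacle_avoiding.forward(obstacle_input) + goal_approaching.forward(goal_input)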

Highlights

  • We demonstrate the capabilities and efficiency of our proposed end-to-end learning of spiking neural networks, based on supervised reward-modulated Spike-Timing-Dependent Plasticity (R-STDP), by performing a target-reaching task with a simulated vehicle

  • Teaching a brain-inspired spiking neural network in a general and easy way is not simple. We tackled this problem by proposing an end-to-end learning rule based on the supervised reward-modulated Spike-Timing-Dependent Plasticity (R-STDP) rule and used it to train two spiking neural networks (SNNs) for an autonomous target-tracking implementation

  • By changing the inputs fed into the network and slightly adjusting how the reward was assigned to the output neurons, the two SNNs successfully learned to exhibit the desired behaviors, and the robot was able to reach a previously set target area while avoiding obstacles

Summary

INTRODUCTION

Despite the success of traditional artificial neural networks (ANNs) in learning complex non-linear functions, interest in spiking neural networks (SNNs) is steadily increasing because SNNs offer many fundamental and inherent advantages over traditional ANNs, such as biological plausibility (Maass, 1997), rapid information processing (Thorpe et al., 2001; Wysoski et al., 2010), and energy efficiency (Drubach, 2000; Cassidy et al., 2014). SNN-based controllers could therefore help mobile robots cope with their limited on-board computing resources and power supply. However, training these networks is notoriously difficult, since the error back-propagation mechanism commonly used in conventional neural networks cannot be directly transferred to SNNs due to the non-differentiability at spike times. Neuroscience studies reveal that the brain modulates the outcome of STDP synapses using one or more chemicals emitted by certain neurons. This mechanism inspired a new method for training SNNs known as reward-modulated spike-timing-dependent plasticity (R-STDP) (Izhikevich, 2007). To this end, our paper explores an indirect SNN training approach based on the R-STDP learning rule within a supervised learning framework. The SNN with the transferred policy can in turn control the robot in an energy-efficient way, which can be achieved by running it on neuromorphic hardware (Blum et al., 2017). All of our code and experiment demos can be found in the Supplementary Files.
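
As a rough illustration of the R-STDP idea (Izhikevich, 2007): each synapse maintains an eligibility trace that accumulates ordinary STDP contributions and decays over time, and a reward signal gates how much of that trace is actually written into the weight. The sketch below follows this textbook formulation; the constants, names, and the discrete-time update are assumptions for illustration, not the exact rule used in the paper.

import numpy as np

A_PLUS, A_MINUS = 1.0, -1.0     # STDP amplitudes (assumed values)
TAU_PLUS = TAU_MINUS = 20.0     # STDP time constants in ms (assumed)
TAU_C = 1000.0                  # eligibility-trace time constant in ms (assumed)

def stdp_window(dt_ms):
    """Pair-based STDP window: potentiation when the presynaptic spike
    precedes the postsynaptic spike (dt_ms = t_post - t_pre >= 0)."""
    if dt_ms >= 0:
        return A_PLUS * np.exp(-dt_ms / TAU_PLUS)
    return A_MINUS * np.exp(dt_ms / TAU_MINUS)

def r_stdp_step(w, c, dt_ms, reward, lr=0.01, step_ms=1.0):
    """One simulation step: decay the eligibility trace c, add the STDP
    contribution of the latest spike pair, and let the reward signal
    decide how much of the trace is written into the weight w."""
    c = c * np.exp(-step_ms / TAU_C) + stdp_window(dt_ms)
    w = w + lr * reward * c
    return w, c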

RELATED WORK
MODELING OF SPIKING NEURAL NETWORK
Network Model
Supervised R-STDP Learning Rule
REFERENCE DATASET
Obstacle-Avoiding Dataset
Goal-Approaching Dataset
Calculating the Pioneer-Robot Motor Speeds
Target-Reaching Control Structure
Goal-Approaching Sub-controller
Obstacle-Avoiding Sub-controller
RESULTS AND DISCUSSION
Testing Environments
Obstacle-Avoiding Sub-controller
Overall Performance
DISCUSSIONS AND LIMITATIONS
CONCLUSION AND FUTURE WORK