Abstract

Spiking neural networks (SNNs) offer many advantages over traditional artificial neural networks (ANNs), such as biological plausibility, fast information processing, and energy efficiency. Although SNNs have been used to solve a variety of control tasks using the Spike-Timing-Dependent Plasticity (STDP) learning rule, existing solutions usually rely on hard-coded network architectures tailored to specific tasks rather than solving different kinds of tasks in a general way. This neglects one of the biggest advantages of ANNs, namely being general-purpose and easy to use owing to their simple network architecture, which usually consists of an input layer, one or more hidden layers, and an output layer. This paper addresses the problem by introducing an end-to-end learning approach for spiking neural networks built with a single hidden layer and all-to-all reward-modulated Spike-Timing-Dependent Plasticity (R-STDP) synapses. We use the supervised R-STDP learning rule to train two SNN-based sub-controllers to replicate desired obstacle-avoiding and goal-approaching behaviors provided by pre-generated datasets. Together they make up a target-reaching controller, which is used to drive a simulated mobile robot to a target area while avoiding obstacles in its path. We demonstrate the performance and effectiveness of our trained SNNs in achieving target-reaching tasks in different unknown scenarios.
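
For concreteness, the following is a minimal structural sketch (Python/NumPy) of the controller layout described above: two SNNs, each with a single hidden layer and all-to-all connections, whose outputs are combined into one target-reaching command. The layer sizes, class and function names, the rate-based forward pass, and the additive combination of the two sub-controllers are illustrative assumptions, not the paper's exact implementation.

import numpy as np

class ThreeLayerSNN:
    """Input layer -> one hidden layer -> output layer, fully connected."""
    def __init__(self, n_in, n_hidden, n_out, seed=0):
        rng = np.random.default_rng(seed)
        # All-to-all connections whose weights would be trained with R-STDP.
        self.w_in_hid = rng.uniform(0.0, 1.0, (n_in, n_hidden))
        self.w_hid_out = rng.uniform(0.0, 1.0, (n_hidden, n_out))

    def forward(self, input_rates):
        # Rate-based placeholder standing in for the spiking dynamics.
        hidden = np.tanh(input_rates @ self.w_in_hid)
        return hidden @ self.w_hid_out

# One sub-controller per behavior; their outputs (e.g., left/right motor
# speeds) are combined into the final target-reaching command.
obstacle_avoiding = ThreeLayerSNN(n_in=8, n_hidden=16, n_out=2)
goal_approaching = ThreeLayerSNN(n_in=4, n_hidden=16, n_out=2)

def target_reaching_command(obstacle_input, goal_input):
    return obstacle_avoiding.forward(obstacle_input) + goal_approaching.forward(goal_input)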

Highlights

  • We demonstrate the capabilities and efficiency of our proposed end-to-end learning of spiking neural networks, based on supervised reward-modulated Spike-Timing-Dependent Plasticity (R-STDP), by performing a target-reaching task with a simulated vehicle

  • Teaching a brain-inspired spiking neural network in a general and easy way is not simple. We tackled this problem by proposing an end-to-end learning rule based on the supervised reward-modulated Spike-Timing-Dependent Plasticity (R-STDP) rule and used it to train two spiking neural networks (SNNs) for an autonomous target-tracking implementation

  • By changing the inputs fed into the network and slightly adjusting how the reward was assigned to the output neurons, the two SNNs successfully learned to exhibit the desired behaviors, and the robot was able to reach a previously set target area while avoiding obstacles

Summary

INTRODUCTION

Despite the success of traditional artificial neural networks (ANNs) in learning complex non-linear functions, interest in spiking neural networks (SNNs) is steadily increasing because SNNs offer many fundamental and inherent advantages over traditional ANNs, such as biological plausibility (Maass, 1997), rapid information processing (Thorpe et al., 2001; Wysoski et al., 2010), and energy efficiency (Drubach, 2000; Cassidy et al., 2014). SNN-based controllers could therefore help mobile robots cope with their limited on-board computing resources and power supply. However, training these networks is notoriously difficult, since the error back-propagation mechanism commonly used in conventional neural networks cannot be directly transferred to SNNs due to the non-differentiability at spike times. Neuroscience studies reveal that the brain modulates the outcome of STDP synapses using one or more chemicals emitted by certain neurons. This mechanism inspired a new method for training SNNs known as reward-modulated spike-timing-dependent plasticity (R-STDP) (Izhikevich, 2007). To this end, our paper explores an indirect SNN training approach based on the R-STDP learning rule within a supervised learning framework. The SNN with the transferred policy can in turn control the robot in an energy-efficient way, which can be achieved by running it on neuromorphic hardware (Blum et al., 2017). All of our code and experiment demos can be found in the Supplementary Files.
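
As a rough illustration of the R-STDP idea (Izhikevich, 2007): each synapse maintains an eligibility trace that accumulates ordinary STDP contributions and decays over time, and a reward signal gates how much of that trace is actually written into the weight. The sketch below follows this textbook formulation; the constants, names, and the discrete-time update are assumptions for illustration, not the exact rule used in the paper.

import numpy as np

A_PLUS, A_MINUS = 1.0, -1.0     # STDP amplitudes (assumed values)
TAU_PLUS = TAU_MINUS = 20.0     # STDP time constants in ms (assumed)
TAU_C = 1000.0                  # eligibility-trace time constant in ms (assumed)

def stdp_window(dt_ms):
    """Pair-based STDP window: potentiation when the presynaptic spike
    precedes the postsynaptic spike (dt_ms = t_post - t_pre >= 0)."""
    if dt_ms >= 0:
        return A_PLUS * np.exp(-dt_ms / TAU_PLUS)
    return A_MINUS * np.exp(dt_ms / TAU_MINUS)

def r_stdp_step(w, c, dt_ms, reward, lr=0.01, step_ms=1.0):
    """One simulation step: decay the eligibility trace c, add the STDP
    contribution of the latest spike pair, and let the reward signal
    decide how much of the trace is written into the weight w."""
    c = c * np.exp(-step_ms / TAU_C) + stdp_window(dt_ms)
    w = w + lr * reward * c
    return w, c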

RELATED WORK
MODELING OF SPIKING NEURAL NETWORK
Network Model
Supervised R-STDP Learning Rule
REFERENCE DATASET
Obstacle-Avoiding Dataset
Goal-Approaching Dataset
Calculating the Pioneer-Robot Motor Speeds
Target-Reaching Control Structure
Goal-Approaching Sub-controller
Obstacle-Avoiding Sub-controller
RESULTS AND DISCUSSION
Testing Environments
Obstacle-Avoiding Sub-controller
Overall Performance
DISCUSSIONS AND LIMITATIONS
CONCLUSION AND FUTURE WORK