Abstract

Spiking neural networks (SNNs) offer a promising alternative to current artificial neural networks (ANNs) for enabling low-power, event-driven neuromorphic hardware. Spike-based neuromorphic applications require processing and extracting meaningful information from spatio-temporal data, represented as series of spike trains over time. In this paper, we propose a method to synthesize images from multiple modalities in a spike-based environment. We use spiking autoencoders to convert image and audio inputs into compact spatio-temporal representations that are then decoded for image synthesis. For this, we use a direct training algorithm that computes the loss on the membrane potential of the output layer and back-propagates it by using a sigmoid approximation of the neuron's activation function to enable differentiability. The spiking autoencoders are benchmarked on MNIST and Fashion-MNIST and achieve very low reconstruction loss, comparable to ANNs. The spiking autoencoders are then trained to learn meaningful spatio-temporal representations of the data across the two modalities, audio and visual. We synthesize images from audio in a spike-based environment by first generating and then utilizing such shared multi-modal spatio-temporal representations. Our audio-to-image synthesis model is tested on the task of converting TI-46 digit audio samples to MNIST images. The model synthesizes images with high fidelity and achieves performance competitive with ANNs.
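
The training rule the abstract describes hinges on replacing the neuron's non-differentiable firing threshold with a sigmoid during back-propagation. Below is a minimal PyTorch sketch of that surrogate-gradient idea; the class and variable names are our own illustration, not the authors' released code.

```python
import torch

class SigmoidSurrogateSpike(torch.autograd.Function):
    """Heaviside spike in the forward pass; sigmoid derivative in the backward pass."""

    @staticmethod
    def forward(ctx, v, threshold, slope):
        # Fire wherever the membrane potential crosses the threshold.
        ctx.save_for_backward(v)
        ctx.threshold, ctx.slope = threshold, slope
        return (v >= threshold).float()

    @staticmethod
    def backward(ctx, grad_output):
        (v,) = ctx.saved_tensors
        # Gradient of sigmoid(slope * (v - threshold)): a smooth stand-in for
        # the non-differentiable step used in the forward pass.
        s = torch.sigmoid(ctx.slope * (v - ctx.threshold))
        return grad_output * ctx.slope * s * (1.0 - s), None, None

# Usage: spikes at each time step, with a differentiable path back to v.
v = torch.randn(8, 100, requires_grad=True)   # batch of membrane potentials
spikes = SigmoidSurrogateSpike.apply(v, 1.0, 5.0)
# As in the paper, the loss can be computed directly on the output layer's
# membrane potential, e.g. torch.nn.functional.mse_loss(v, target).
```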

Highlights

  • We demonstrate that spiking autoencoders can be used to generate reduced-duration spike maps (“hidden state”) of an input spike train; these maps are a highly compressed version of the input and can be reused across applications

  • Steering the membrane potentials of all the output neurons at once is extremely hard to optimize; selectively correcting only the neurons that spiked incorrectly makes training easier. This could be applicable to any spiking neural network with a large output layer (a sketch of this masking idea follows below)
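
As a rough illustration of the selective-correction idea in the last highlight, the sketch below masks a membrane-potential loss so that gradients flow only through output neurons whose spike/no-spike decision disagrees with the target spike map. The function name and the exact potential target are our assumptions for illustration, not the paper's definitions.

```python
import torch

def masked_membrane_loss(v_out, target_spikes, threshold=1.0):
    """MSE on membrane potentials, restricted to neurons whose spike/no-spike
    decision disagrees with the target spike map."""
    predicted = (v_out >= threshold).float()
    wrong = (predicted != target_spikes).float()   # 1 only where a neuron erred
    # Drive erring neurons toward the threshold (target spike = 1) or toward
    # rest (target spike = 0); correct neurons contribute no gradient.
    per_neuron = (v_out - target_spikes * threshold) ** 2
    return (wrong * per_neuron).sum() / wrong.sum().clamp(min=1.0)
```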

Introduction

Artificial Neural Networks (ANNs) have become powerful computational tools for complex tasks such as pattern recognition, classification and function estimation problems (LeCun et al., 2015). Each of their compute units, known as a neuron, applies an “activation” function to its inputs. Autoencoders are a class of neural networks that can learn efficient data encodings in an unsupervised manner (Vincent et al., 2008). Their two-layer structure also makes them easy to train. Multiple autoencoders can be trained separately and stacked to build deeper networks.
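
For reference, the sketch below shows the standard (non-spiking) two-layer autoencoder structure the paragraph describes; the layer sizes are illustrative assumptions. Each such block is trained on a reconstruction loss against its own input, and several trained blocks can then be stacked into a deeper network.

```python
import torch.nn as nn

class TwoLayerAutoencoder(nn.Module):
    """Encoder + decoder pair; the hidden layer is the learned compact code."""

    def __init__(self, input_dim=784, hidden_dim=128):
        super().__init__()
        self.encoder = nn.Linear(input_dim, hidden_dim)   # compress the input
        self.decoder = nn.Linear(hidden_dim, input_dim)   # reconstruct it
        self.act = nn.Sigmoid()

    def forward(self, x):
        code = self.act(self.encoder(x))      # compact encoding of x
        return self.act(self.decoder(code))   # reconstruction of x
```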

