Abstract

Crossbar-based neuromorphic computing, also referred to as processing-in-memory or in-situ analog computing, is a popular alternative to conventional von Neumann computing systems for accelerating neural networks. Because a crossbar provides a fixed number of synapses per neuron, neurons must be decomposed to map a network onto the crossbars. This paper proposes the k-spare decomposition algorithm, which trades off predictive performance against neuron usage during the mapping. The algorithm performs a two-level hierarchical decomposition. In the first, global decomposition, it decomposes the neural network so that each crossbar retains k spare neurons; these spares are then used to improve the accuracy of the partially mapped network in the subsequent local decomposition. Experimental results on modern convolutional neural networks show that the proposed method improves accuracy substantially with only about 10% extra neurons.
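
To make the global decomposition step concrete, here is a minimal sketch in Python. It assumes a fully connected layer of n_out neurons with n_in inputs each, p × q crossbars (p synapse rows, q neuron columns), and the simplification that every neuron fits on a single crossbar; the function name and data layout are illustrative, not the paper's implementation, and the local decomposition that later exploits the spares is not shown.

```python
import math

def global_decomposition(n_in, n_out, p, q, k):
    """Split a fully connected layer with n_out neurons of n_in inputs
    onto p x q crossbars, reserving k spare neuron columns per crossbar.
    Illustrative sketch only: real mappings also split neurons across
    crossbars and add merge neurons for partial sums."""
    assert 0 <= k < q
    slices = math.ceil(n_in / p)   # column slices one neuron occupies
    usable = q - k                 # columns left after reserving k spares
    assert slices <= usable, "simplification: one neuron fits one crossbar"
    per_xbar = usable // slices    # whole neurons mapped per crossbar
    n_xbars = math.ceil(n_out / per_xbar)
    mapping = []
    for x in range(n_xbars):
        lo, hi = x * per_xbar, min((x + 1) * per_xbar, n_out)
        mapping.append((x, list(range(lo, hi))))  # crossbar -> neuron ids
    return mapping

# Example: a 256-input, 128-neuron layer on 128 x 16 crossbars with k = 2
print(global_decomposition(n_in=256, n_out=128, p=128, q=16, k=2))
```

Reserving k columns shrinks the usable capacity of every crossbar, which is exactly the accuracy-versus-neuron-usage trade-off the abstract describes.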

Highlights

  • Deep learning has demonstrated astonishing performance in fields such as computer vision, natural language processing, and games over the past several years [1,2,3], and there is no doubt that deep models will play a critical role in machine intelligence in the future

  • To address this challenge, much research effort has been devoted to developing domain-specific computing systems for deep learning using ASICs and FPGAs [6,7]. Most of these systems are built on the von Neumann architecture, where instructions, implicit or explicit, and data are stored in memories separate from the processors

  • The original networks are mapped onto p × q crossbars (a tiling sketch follows this list)
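
As a rough illustration of what mapping onto p × q crossbars means, the sketch below tiles a layer's weight matrix into p × q blocks, one per crossbar. The function tile_weights is hypothetical and omits the merge neurons that would sum the partial dot products of neurons split across row tiles.

```python
import numpy as np

def tile_weights(W, p, q):
    """Tile an (n_in x n_out) weight matrix into p x q crossbar blocks.
    Illustrative sketch; edge tiles are zero-padded to the crossbar size."""
    n_in, n_out = W.shape
    tiles = {}
    for r in range(0, n_in, p):        # split inputs across synapse rows
        for c in range(0, n_out, q):   # split neurons across columns
            block = np.zeros((p, q))
            chunk = W[r:r + p, c:c + q]
            block[:chunk.shape[0], :chunk.shape[1]] = chunk
            tiles[(r // p, c // q)] = block
    return tiles

# Example: a 300 x 40 layer on 128 x 16 crossbars -> 3 x 3 = 9 tiles
tiles = tile_weights(np.random.randn(300, 40), p=128, q=16)
print(len(tiles))
```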


Summary

Introduction

Deep learning has demonstrated astonishing performance in fields such as computer vision, natural language processing, and games over the past several years [1,2,3], and there is no doubt that deep models will play a critical role in machine intelligence in the future. To address this challenge, much research effort has been devoted to developing domain-specific computing systems for deep learning using ASICs and FPGAs [6,7]. Most of these systems are built on the von Neumann architecture, where instructions, implicit or explicit, and data are stored in memories separate from the processors. Neuromorphic systems arrange neuron-like processing elements in a grid, each of which stores a fixed number of weights; this type is referred to as the matrix-based neuromorphic system [12]. Given a trained neural network with high-precision weights, the weights must be quantized, because neuromorphic systems usually employ low-precision synapses in order to integrate a large number of synapses onto a chip [16]. This quantization incurs error, leading to accuracy loss.
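
To see the quantization error mentioned above, the sketch below applies a symmetric uniform quantizer to Gaussian weights at a few bit widths and reports the mean squared error. The quantization scheme is an assumption for illustration; actual neuromorphic hardware may use different schemes (e.g., per-column scaling).

```python
import numpy as np

def quantize_uniform(w, bits):
    """Symmetric uniform quantizer: maps high-precision weights onto
    2**bits - 1 evenly spaced low-precision synapse levels."""
    scale = max(np.max(np.abs(w)), 1e-12)  # guard against all-zero weights
    step = 2 * scale / (2 ** bits - 1)
    return np.round(w / step) * step

rng = np.random.default_rng(0)
w = rng.normal(0.0, 0.1, size=10_000)
for bits in (8, 4, 2):
    mse = np.mean((w - quantize_uniform(w, bits)) ** 2)
    print(f"{bits}-bit synapses: quantization MSE = {mse:.2e}")
```

Lower bit widths produce visibly larger error, which is the accuracy loss the spare neurons of the proposed method are meant to recover.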

Problem Formulation
Decomposition Algorithms
Global Decomposition
Local Decomposition Methods
Candidate Selection
Experimental Results
Conclusions
