Abstract

Neuromorphic computing chips consisting of crossbar arrays of emerging nonvolatile memory (NVM) have the potential to deliver both high energy efficiency and high throughput as low-power implementations of convolutional neural network (CNN) inference engines. However, such hardware has design constraints, such as limited fan-in/fan-out and resource-inefficient mapping, that make designing and deploying CNNs on it challenging. As a result, users must design CNN models with intricate knowledge of the hardware architecture, and CNNs with high-resolution image inputs may not fit on the hardware at all. In this article, we propose the use of aggregated subnets, called NC-net, a constrained form of the traditional layer structure, to solve these issues. With our method, we put forward an energy-efficient architecture that requires neither buffers nor analogue-to-digital and digital-to-analogue converters (ADC/DAC), together with a scalable end-to-end solution that automatically satisfies the hardware constraints of crossbar architectures while optimizing resource usage. In our solution, the exploration and deployment of a CNN for neuromorphic crossbar hardware starts with a design front end based on TensorFlow. Our automated design flow then maps the NC-net from TensorFlow to the crossbar architecture. We tested our designs on both a simulator and a field-programmable gate array (FPGA) emulator with various benchmarks.
In addition to general benchmarks, including MNIST, SVHN, CIFAR-10, and CIFAR-100, we tested our system on a real-world application: human detection with high-resolution (224 × 224) images as the input. Our system achieves state-of-the-art accuracy for these benchmarks on crossbar-based neuromorphic hardware, with an accuracy of more than 90% on the human detection task. It also yields up to a 4.25× improvement in the efficiency of spiking core usage compared to TrueNorth.
