Abstract

This article presents the design and implementation of an embedded programmable processor with a custom instruction set architecture for the efficient realization of artificial neural networks (ANNs). The ANN processor architecture is scalable, supporting an arbitrary number of layers and an arbitrary number of artificial neurons (ANs) per layer. Moreover, the processor supports ANNs with arbitrary interconnect structures among ANs, so it can realize both feed-forward and dynamic recurrent networks. The processor architecture is also customizable: the numerical representation of inputs, outputs, and signals among ANs can be parameterized to an arbitrary fixed-point format. An ASIC implementation of the designed programmable ANN processor for networks with up to 512 ANs and 262,000 interconnects is presented and is estimated to occupy 2.23 mm² of silicon area and consume 1.25 mW of power from a 1.6 V supply while operating at 74 MHz in a standard 32-nm CMOS technology. To assess and compare the efficiency of the designed ANN processor, we have also designed and implemented a dedicated reconfigurable hardware architecture for the direct realization of ANNs. Characteristics and implementation results of the programmable ANN processor and the dedicated ANN hardware on a Xilinx Artix-7 field-programmable gate array (FPGA) are presented and compared using two benchmarks: the MNIST benchmark using a feed-forward ANN, and a movie review sentiment analysis benchmark using a recurrent neural network.
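
To make the idea of a parameterizable fixed-point format concrete, the following is a minimal Python sketch of how one artificial neuron could be evaluated in signed fixed-point arithmetic. It is an illustration under stated assumptions, not the processor's actual datapath or instruction set: the function names (`to_fixed`, `to_float`, `neuron`), the saturating behavior, the truncating rescale, and the ReLU activation are all hypothetical choices made for this example.

```python
def to_fixed(x, frac_bits, total_bits):
    """Quantize a real number to a signed fixed-point integer.

    Assumption: round to nearest (Python's round) and saturate on overflow.
    """
    scaled = int(round(x * (1 << frac_bits)))
    lo = -(1 << (total_bits - 1))
    hi = (1 << (total_bits - 1)) - 1
    return max(lo, min(hi, scaled))


def to_float(q, frac_bits):
    """Convert a fixed-point integer back to a real number."""
    return q / (1 << frac_bits)


def neuron(inputs_q, weights_q, bias_q, frac_bits, total_bits):
    """Evaluate one neuron on fixed-point operands (hypothetical datapath)."""
    # Products carry 2 * frac_bits fractional bits, so align the bias first.
    acc = bias_q << frac_bits
    for x, w in zip(inputs_q, weights_q):
        acc += x * w  # multiply-accumulate in a wide accumulator
    acc >>= frac_bits  # rescale to frac_bits (arithmetic shift, rounds down)
    lo = -(1 << (total_bits - 1))
    hi = (1 << (total_bits - 1)) - 1
    acc = max(lo, min(hi, acc))  # saturate to the output word width
    return max(0, acc)  # ReLU activation, assumed here for simplicity


# Example with a 16-bit word and 8 fractional bits (an assumed format).
FRAC, TOTAL = 8, 16
xs = [to_fixed(v, FRAC, TOTAL) for v in (0.5, -0.25)]
ws = [to_fixed(v, FRAC, TOTAL) for v in (1.0, 2.0)]
b = to_fixed(0.25, FRAC, TOTAL)
y = neuron(xs, ws, b, FRAC, TOTAL)
print(to_float(y, FRAC))  # 0.5*1.0 + (-0.25)*2.0 + 0.25 = 0.25
```

Changing `FRAC` and `TOTAL` changes the precision and range of every signal, which is the kind of trade-off between accuracy and hardware cost that a parameterizable fixed-point format exposes.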
