A reconfigurable field-programmable gate array (FPGA)-based wired-logic deep neural network (DNN) accelerator is presented. The wired-logic architecture achieves a high energy efficiency of 16 nJ/classification on the Modified National Institute of Standards and Technology (MNIST) dataset. Each neuron in the neural network consists of combinational circuits only, and all the neurons are implemented on an FPGA. Intermediate data are never stored in memory or registers; they reach the output stage by passing only through neuron cells. This minimizes the latency and power required for memory access, enabling low-power, high-throughput operation. A critical technical issue is reducing hardware resources, because all neurons must be implemented on an FPGA, where hardware resources are limited. Two core technologies have been developed to minimize the required hardware resources: (1) a neural network with a small number of neurons in which the weight values of all synapses are fixed to a common value, and (2) a small neuron cell circuit consisting of an adder and a look-up table containing an activation function. Fixing all weight values to a common value simplifies processing from multiply-accumulate operations to additions alone, which minimizes the hardware resources required for each neuron. An experiment with the MNIST dataset using a 28-nm FPGA confirmed a power consumption of 0.16 W and a latency per inference of 100 ns (16 nJ/classification). At the same recognition accuracy, the power efficiency is 45.6 times higher than that of a conventional state-of-the-art binarized DNN accelerator implemented as a digital ASIC.
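The neuron cell described above (an adder followed by a look-up table) can be illustrated with a minimal software sketch. It is not the authors' circuit: the function names, the sigmoid activation, the centering offset, the input range, and the number of output levels are all assumptions chosen for illustration. The sketch relies only on the identity that with a common weight w, the sum of w·x_i equals w times the sum of x_i, so the weight can be folded into the table and the neuron itself needs no multiplier. The last helper checks the reported energy figure from the stated power and latency.

```python
import math


def build_activation_lut(max_sum, weight, out_levels=4):
    """Tabulate a quantized activation for every possible adder output.

    Assumptions (not from the source): a sigmoid activation centered at
    max_sum / 2, quantized to `out_levels` integer levels.
    The common weight is applied here, once, when the table is built.
    """
    lut = []
    for s in range(max_sum + 1):
        y = 1.0 / (1.0 + math.exp(-weight * (s - max_sum / 2)))
        lut.append(round(y * (out_levels - 1)))
    return lut


def neuron(inputs, lut):
    """Adder followed by a table lookup; no multiplication is performed."""
    s = sum(inputs)               # the adder
    s = min(s, len(lut) - 1)      # saturate to the table range (assumption)
    return lut[s]                 # the activation look-up table


def energy_per_inference_nj(power_w, latency_ns):
    """Energy in nanojoules: watts times nanoseconds gives nanojoules."""
    return power_w * latency_ns


# Hypothetical parameters for illustration only.
lut = build_activation_lut(max_sum=8, weight=0.5)
print(neuron([1, 1, 0, 1], lut))
print(energy_per_inference_nj(0.16, 100))  # power and latency from the abstract
```

Because the table is indexed directly by the adder output, changing the common weight or the activation function only changes the table contents, not the structure of the neuron.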