Abstract

Power consumption has long been a significant concern in neural networks. In particular, large neural networks that implement novel machine learning techniques require far more computation, and hence power, than ever before. In this chapter, we showed that computation reuse can exploit the inherent redundancy in the arithmetic operations of a neural network to save power. Experimental results showed that computation reuse, when coupled with the approximation property of neural networks, can eliminate up to 90% of the multiplications, reducing power consumption by 61% on average in the presented architecture. The proposed computation reuse-aware design can be extended in several ways. First, it can be integrated into state-of-the-art customized architectures for LSTM, spiking, and convolutional neural network models to further reduce power consumption. Second, computation reuse can be coupled with existing mapping and scheduling algorithms to develop reuse-aware scheduling and mapping methods for neural networks. Computation reuse can also boost the performance of methods that eliminate ineffectual computations in deep learning networks. Evaluating the impact of CORN on reliability and customizing the CORN architecture for FPGA-based neural network implementations are other directions for future work.
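The core idea, computing a shared arithmetic result once and reusing it wherever the same operand pair recurs, can be illustrated with a minimal sketch. The snippet below is a hypothetical illustration, not the CORN datapath: it quantizes weights and inputs (the approximation step) so that operand pairs repeat, then memoizes each unique product during a matrix-vector multiply. The function name reuse_matvec, the uniform quantization scheme, and the 4-bit default are assumptions made for illustration only.

```python
import numpy as np

def reuse_matvec(W, x, bits=4):
    """Illustrative matrix-vector product with computation reuse.

    W and x are quantized to `bits` bits so that many (weight, input)
    operand pairs repeat; each unique product is computed once, cached,
    and reused. Hypothetical sketch, not the CORN architecture itself.
    """
    def quantize(a, bits):
        # Approximation step: uniform quantization of the operands.
        scale = (2 ** (bits - 1) - 1) / (np.max(np.abs(a)) + 1e-12)
        return np.round(a * scale) / scale

    Wq, xq = quantize(W, bits), quantize(x, bits)

    product_cache = {}            # (weight value, input value) -> product
    reused = 0                    # multiplications avoided through reuse
    y = np.zeros(Wq.shape[0])

    for i in range(Wq.shape[0]):
        for j in range(Wq.shape[1]):
            key = (Wq[i, j], xq[j])
            if key in product_cache:
                reused += 1       # reuse the previously computed product
            else:
                product_cache[key] = Wq[i, j] * xq[j]
            y[i] += product_cache[key]

    print(f"reused {reused} of {Wq.size} multiplications")
    return y

# Example: a 64x256 layer with 4-bit operands exposes many repeated products.
y = reuse_matvec(np.random.randn(64, 256), np.random.randn(256))
```

With aggressively quantized operands, the number of distinct (weight, input) pairs is far smaller than the number of multiplications in the layer, which is the redundancy that a reuse-aware hardware design exploits to cut multiplier activity and power.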
