Scaling for edge inference of deep neural networks

Xiaowei Xu,Yu Hu,Sharon Xiaobo Hu,Yukun Ding,Jason Cong,Yiyu Shi,Michael Niemier

doi:10.1038/s41928-018-0059-3

Abstract

Deep neural networks offer considerable potential across a range of applications, from advanced manufacturing to autonomous cars. A clear trend in deep neural networks is the exponential growth of network size and the associated increases in computational complexity and memory consumption. However, the performance and energy efficiency of edge inference, in which the inference (the application of a trained network to new data) is performed locally on embedded platforms that have limited area and power budget, is bounded by technology scaling. Here we analyse recent data and show that there are increasing gaps between the computational complexity and energy efficiency required by data scientists and the hardware capacity made available by hardware architects. We then discuss various architecture and algorithm innovations that could help to bridge the gaps. This Perspective highlights the existence of gaps between the computational complexity and energy efficiency required for the continued scaling of deep neural networks and the hardware capacity actually available with current CMOS technology scaling, in situations where edge inference is required; it then discusses various architecture and algorithm innovations that could help to bridge these gaps.

Full Text