Deep neural networks (DNNs) have surpassed other algorithms in analyzing today's abundant data. Due to the security and latency requirements of many applications, data analysis should happen directly on edge devices. Edge devices, however, struggle to support increasingly complex DNNs because of the models' high computational load and large number of parameters. Edge devices therefore require efficient accelerators to process DNNs. However, the design of DNN accelerators remains challenging, as there is a lack of established design techniques targeting a specific design point in terms of energy budget, area, time-to-solution, and classification accuracy. This article fills this gap by providing a quantitative, large-scale, state-of-the-art comparison of DNN accelerators based on published data that is normalized for technology node and benchmark. This leveled comparison enables learning from previous designs by considering the impact of each technique on energy, area, and time-to-solution. Furthermore, the key design techniques used in DNN accelerators are classified according to their influence on classification accuracy. Finally, we provide a discussion of hardware accelerators to support future designers in considering the trade-off between efficiency and accuracy and in identifying the most suitable techniques for given benchmarks.
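The abstract does not detail the normalization procedure; as an illustration only, a common first-order approach is to rescale each accelerator's reported energy per operation to a shared reference technology node, assuming switching energy scales with capacitance (roughly proportional to feature size) and with supply voltage squared. The function name, reference node, and scaling rule below are assumptions, not the article's method:

```python
def normalize_energy(energy_pj: float, node_nm: float, vdd: float,
                     ref_node_nm: float = 28.0, ref_vdd: float = 0.9) -> float:
    """Rescale a reported energy-per-operation figure to a reference node.

    First-order sketch: dynamic energy ~ C * Vdd^2, with capacitance
    assumed proportional to the feature size. Reference node (28 nm)
    and voltage (0.9 V) are illustrative choices, not from the article.
    """
    return energy_pj * (ref_node_nm / node_nm) * (ref_vdd / vdd) ** 2


# Example: a 10 pJ/op figure reported at 65 nm / 1.1 V shrinks when
# projected to 28 nm / 0.9 V, making it comparable to 28 nm designs.
projected = normalize_energy(10.0, node_nm=65.0, vdd=1.1)
```

Such leveling lets accelerators fabricated in different technologies be compared on a common footing, at the cost of ignoring second-order effects such as leakage and wire scaling.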