Abstract

Deep neural networks (DNNs) have become the most important and popular machine learning technique in the emerging artificial intelligence era. Because of their inherently large scale, DNN models are both computation intensive and storage intensive, posing huge challenges for efficient deployment. A promising solution to this problem is to build customized hardware accelerators that improve the processing speed and energy efficiency of DNN execution. However, the architecture design of a specialized DNN accelerator is nontrivial, given the massive amount of data movement, the rapid development of DNN algorithms and models, the high demand for reconfigurability and programmability, and the strict requirement of preserving accuracy. To date, many different design solutions, spanning the device, circuit, architecture, and algorithm levels, have been proposed and implemented. This article reviews digital CMOS‐based DNN hardware architectures. By analyzing the design requirements and challenges of DNN accelerators within the classical von Neumann framework, we introduce the basic underlying hardware architecture and computation mapping strategy. Building on that foundation, advanced optimization techniques are then described. Finally, the open problems and challenges for future DNN hardware architectures are analyzed and elaborated.
