Abstract
This paper presents the results of our analysis of the main problems that have to be solved in the design of highly parallel high-performance accelerators for Deep Neural Networks (DNNs) used in low power Cyber–Physical System (CPS) and Internet of Things (IoT) devices, in application areas such as smart automotive, health and smart services in social networks (Facebook, Instagram, X/Twitter, etc.). Our analysis demonstrates that to arrive a to high-quality DNN accelerator architecture, complex mutual trade-offs have to be resolved among the accelerator micro- and macro-architecture, and the corresponding memory and communication architectures, as well as among the performance, power consumption and area. Therefore, we developed a multi-processor accelerator design methodology involving an automatic design-space exploration (DSE) framework that enables a very efficient construction and analysis of DNN accelerator architectures, as well as an adequate trade-off exploitation. To satisfy the low power demands of IoT devices, we extend our quality-driven model-based multi-processor accelerator design methodology with some novel power optimization techniques at the Processor’s and memory exploration stages. Our proposed power optimization techniques at the processor’s exploration stage achieve up to 66.5% reduction in power consumption, while our proposed data reuse techniques avoid up to 85.92% of redundant memory accesses thereby reducing the power consumption of accelerator necessary for low-power IoT applications. Currently, we are beginning to apply this methodology with the proposed power optimization techniques to the design of low-power DNN accelerators for IoT applications.
Published Version
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have