Abstract

Monolithic 3D (MONO3D) integration offers performance and power-efficiency benefits over 2D circuits, making it a promising technology for designing energy-efficient Deep Neural Network (DNN) accelerators. However, high IC temperatures are a major challenge in the design of MONO3D systems. This paper therefore focuses on designing temperature-aware MONO3D DNN accelerators. We propose a new automated method, called TREAD-M3D, that provides a near-optimal MONO3D DNN accelerator architecture in terms of systolic array size, SRAM organization, partition across 3D layers, and operating frequency, for a given DNN, optimization goal, and temperature constraint. TREAD-M3D incorporates circuit- and architecture-level models to evaluate the power and performance characteristics of different partitions. Our method reveals valuable insights and enables tradeoff analysis for achieving high energy efficiency in MONO3D systolic arrays. Compared with recent works that adopt a fixed partition choice to design MONO3D DNN systems, TREAD-M3D yields up to 22% higher energy efficiency. Using TREAD-M3D, we further demonstrate that ignoring temperature not only leads to infeasible configurations due to temperature violations but also overestimates energy-delay-product benefits by up to 24%.
