Abstract

High-bandwidth memory (HBM) offers breakthrough memory bandwidth through its vertically stacked memory architecture and through-silicon via (TSV)-based fast interconnect. However, the stacked architecture leads to high power density, causing thermal issues when running modern memory-hungry workloads such as deep neural networks (DNNs). Prior works on dynamic thermal management (DTM) of 3-D DRAM do not consider the physical structure of HBM and often incur a heavy DTM-induced performance penalty. We propose NeuroMap, an application-aware, efficient task-mapping and migration-based DTM policy that maps DNN instances to cores by exploiting the channel layout of HBM and leveraging the significant temperature gradient across DRAM dies when making thermal decisions. We exploit the variation in the memory access behavior of DNN layers to minimize stalling due to thermal hotspots in the HBM stack. We also use application-aware dynamic voltage and frequency scaling (DVFS) and DRAM low-power states to further improve performance. Experimental results on workloads comprising seven popular DNNs show that NeuroMap reduces average execution time and memory energy by 39% and 40%, respectively, over state-of-the-art DTM mechanisms.
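To make the mapping idea concrete, the following Python sketch illustrates one way a temperature-gradient-aware mapper could pair memory-intensive DNN instances with cooler HBM channels. This is an illustrative assumption, not the paper's NeuroMap implementation: the `DnnInstance` structure, the per-channel temperature inputs, the memory-intensity metric, and the `thermal_threshold` parameter are all hypothetical.

```python
# Illustrative sketch (assumed, not the paper's actual NeuroMap policy):
# greedily place the most memory-intensive DNN instances on the coolest
# HBM channels, skipping channels that are already near the DTM trigger.

from dataclasses import dataclass
from typing import Dict, List


@dataclass
class DnnInstance:
    name: str
    mem_intensity: float  # assumed metric, e.g., memory accesses per kilo-instruction


def map_instances_to_channels(
    instances: List[DnnInstance],
    channel_temps: Dict[int, float],   # channel id -> estimated temperature (deg C)
    thermal_threshold: float = 85.0,   # hypothetical DTM trigger temperature
) -> Dict[str, int]:
    """Assign memory-intensive instances to the coolest available channels.

    Channels at or above the thermal threshold are skipped so that hot
    regions of the stack get time to cool, reducing DTM-induced stalls.
    """
    # Coolest channels first; channels near throttling are excluded.
    usable = sorted(
        (c for c, t in channel_temps.items() if t < thermal_threshold),
        key=lambda c: channel_temps[c],
    )
    # Most memory-intensive instances first: they benefit most from cool channels.
    ordered = sorted(instances, key=lambda i: i.mem_intensity, reverse=True)

    mapping: Dict[str, int] = {}
    for idx, inst in enumerate(ordered):
        if not usable:
            break  # all channels are hot; remaining instances must wait or migrate
        mapping[inst.name] = usable[idx % len(usable)]
    return mapping


if __name__ == "__main__":
    temps = {0: 72.0, 1: 88.0, 2: 65.0, 3: 79.0}  # assumed per-channel estimates
    workload = [
        DnnInstance("resnet50", mem_intensity=41.0),
        DnnInstance("mobilenet", mem_intensity=18.0),
        DnnInstance("vgg16", mem_intensity=55.0),
    ]
    print(map_instances_to_channels(workload, temps))
```

In this sketch, channel 1 is excluded because it exceeds the assumed threshold, and the most memory-intensive instance is steered to the coolest remaining channel; the actual NeuroMap policy additionally incorporates migration, DVFS, and DRAM low-power states as described in the abstract.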
