Abstract

Emerging High-Performance Computing (HPC) workloads, such as graph analytics, machine learning, and big data science, are data-intensive. These workloads usually exhibit irregular memory footprints with limited data locality, and thus incur frequent cache misses and a growing demand for memory bandwidth. Driven by this need, 3D-stacked memory devices such as Hybrid Memory Cube (HMC) and High Bandwidth Memory (HBM) have been introduced to deliver significantly higher throughput. However, the traditional interfaces and optimization methods for JEDEC DDR devices cannot fully exploit the potential performance of 3D-stacked memory when handling the massive irregular memory accesses that accompany data-intensive applications. In this paper, we propose a novel Hotspot-Aware Manager (HAM) infrastructure for 3D-stacked memory devices that optimizes memory access streams via request aggregation, hotspot detection, prefetching, and an associated hotspot-aware page policy. We present the HAM design and its simulation implementation on RISC-V embedded cores with attached HMC devices. We have conducted extensive evaluations with over 12 benchmarks and applications representing diverse irregular memory access patterns. The results reveal that HAM reduces redundant memory accesses by 37.51% and improves the prefetch buffer hit rate by 4.19X on average. Overall, HAM exhibits an average performance gain of 21.81% (up to 34.28%) and a 35.07% power saving over standard 3D-stacked memory.

