Abstract

Many contemporary data-intensive applications exhibit irregular, highly concurrent memory access patterns that strain conventional memory systems. To meet the growing demand for high-bandwidth, low-latency memory, 3D-stacked memory devices such as the Hybrid Memory Cube (HMC) and High Bandwidth Memory (HBM) were designed to provide significantly higher throughput than standard JEDEC DDR devices. However, existing memory interfaces and coalescing models, designed for conventional DDR devices, cannot fully exploit the bandwidth potential of these 3D-stacked devices. To close this gap, we introduce a novel paged adaptive coalescer (PAC) infrastructure with a scalable coalescing network for 3D-stacked memory. We present the design and simulated implementation of this approach on RISC-V embedded cores with attached HMC devices. Our evaluation shows that PAC achieves an average coalescing efficiency of 56.01% and reduces bank conflicts and power consumption by 85.16% and 59.21%, respectively. Overall, PAC achieves an average performance gain of 14.35% (up to 26.06%) across 14 test suites. These results demonstrate the potential of the PAC methodology for architectures targeting increasingly critical data-intensive algorithms and applications.
