Abstract

Translation lookaside buffers (TLBs) were recently introduced into modern graphics processing unit (GPU) architectures to support virtual memory addressing. Compared to CPUs, the performance of GPUs is more sensitive to TLB capacity because of their heavier memory access traffic. However, the large SRAM cell area greatly limits the implementable capacity of conventional SRAM-based TLBs. In this work, we propose using spin-transfer torque RAM (STT-RAM) to construct TLBs, in light of the unique memory access pattern of GPUs, i.e., infrequent data updates. An STT-RAM TLB can replace its same-area SRAM counterpart while offering greater capacity, similar read performance, and lower energy consumption. As an optimization of the STT-RAM TLB, we further propose an STT-RAM-based dynamically configurable TLB (STD-TLB) that leverages a differential sensing technique. STD-TLB can switch between a high-capacity mode and a high-performance mode on the fly based on real-time application needs. Our experiments show that, compared to an SRAM TLB, a standard STT-RAM TLB improves the performance and energy-delay product of GPU address translation by 32% and 75%, respectively, while STD-TLB achieves additional improvements of 15% and 13% over the standard STT-RAM TLB.
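
For readers unfamiliar with the metric, the energy-delay product (EDP) is conventionally the product of energy consumed and execution time, with lower values being better. The sketch below shows one common way a relative EDP improvement could be computed. The function names and the input numbers are hypothetical and chosen only for illustration; they are not measurements from this work, and the abstract does not state which convention the authors use for reporting improvement.

```python
def energy_delay_product(energy_joules: float, delay_seconds: float) -> float:
    """Return EDP = energy x delay. Lower is better."""
    return energy_joules * delay_seconds


def relative_reduction(baseline: float, candidate: float) -> float:
    """Return the fractional reduction of `candidate` relative to `baseline`."""
    return (baseline - candidate) / baseline


# Hypothetical values for illustration only; NOT results from the paper.
baseline_edp = energy_delay_product(energy_joules=2.0, delay_seconds=4.0)
candidate_edp = energy_delay_product(energy_joules=1.5, delay_seconds=3.0)

# Fraction by which the candidate design lowers EDP versus the baseline.
edp_reduction = relative_reduction(baseline_edp, candidate_edp)
print(f"EDP reduction: {edp_reduction:.2%}")
```

Because EDP multiplies the two quantities, a design that lowers both energy and delay sees a reduction in EDP larger than the reduction in either quantity alone.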
