Abstract

As computer memory grows in size and processors continue to get faster, the memory subsystem becomes a bottleneck to system performance. To mitigate relatively slow dynamic random access memory (DRAM) chip speeds, a new generation of 3-D stacked DRAM is being developed, with lower power consumption and higher bandwidth. This paper proposes 3-D ring-based data fabrics for fast data transfer between the chips in 3-D stacked DRAM. The ring-based data fabric uses a fast standing wave oscillator to clock its transactions. Because the clocking scheme is fast and multiple channels share the same bus, more channels can be utilized while the number of through-silicon vias is significantly reduced. Our memory architecture using a ring-based scheme (MARS) can effectively trade off power, throughput, and latency to improve system performance for different application spaces. Experimental results show that our ring-based data fabric can reduce read latencies and power consumption. Compared with high bandwidth memory, MARS variants deliver better latency (up to $\sim 4\times$), power (up to $\sim 8\times$), and performance per watt (up to $\sim 8\times$). We also compare our approach with Wide I/O, which is designed for power-constrained systems. Against Wide I/O, MARS variants provide better latency (up to $\sim 8\times$) with similar performance per watt.

