Abstract
The memory system is a fundamental performance and energy bottleneck in almost all computing systems. Recent system design, application, and technology trends that require more capacity, bandwidth, efficiency, and predictability out of the memory system make it an even more important system bottleneck. At the same time, DRAM technology is experiencing difficult technology scaling challenges that make the maintenance and enhancement of its capacity, energy-efficiency, and reliability significantly more costly with conventional techniques. In this paper, after describing the demands and challenges faced by the memory system, we examine some promising research and design directions to overcome challenges posed by memory scaling. Specifically, we survey three key solution directions: 1) enabling new DRAM architectures, functions, interfaces, and better integration of the DRAM and the rest of the system, 2) designing a memory system that employs emerging memory technologies and takes advantage of multiple different technologies, 3) providing predictable performance and QoS to applications sharing the memory system. We also briefly describe our ongoing related work in combating scaling challenges of NAND flash memory.
Highlights
Main memory is a critical component of all computing systems, whether they are server, embedded, desktop, mobile, or sensor systems.
Some emerging resistive memory technologies, such as phase change memory (PCM) [64, 71, 37, 38, 63] or spin-transfer torque magnetic memory (STT-MRAM) [13, 35], appear more scalable, have latency and bandwidth characteristics much closer to DRAM than flash memory and hard disks, and are non-volatile with little idle power consumption. Such emerging technologies can enable new opportunities in system design, including, for example, the unification of memory and storage subsystems. They have the potential to be employed as part of main memory, alongside or in place of less scalable and leaky DRAM, but they have various shortcomings, depending on the technology, that need to be overcome.
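To make the hybrid-memory idea concrete, here is a minimal sketch (not from the paper) of a hot/cold page placement policy for a DRAM+PCM main memory: frequently accessed pages are served from fast but leaky DRAM, while cold pages remain in denser, non-volatile PCM. The class name, promotion threshold, and counting scheme are all hypothetical illustrations, not the paper's mechanism.

```python
from collections import Counter

class HybridMemoryPlacer:
    """Toy hot/cold placement policy for a hybrid DRAM+PCM main memory.

    This is an illustrative sketch: real proposals track access counts in
    hardware and migrate pages, with many more considerations (write
    endurance, migration cost, etc.).
    """

    def __init__(self, hot_threshold: int = 4):
        # Hypothetical threshold: a page this frequently accessed is "hot".
        self.hot_threshold = hot_threshold
        self.access_counts = Counter()

    def access(self, page: int) -> str:
        """Record an access and return which device serves the page."""
        self.access_counts[page] += 1
        if self.access_counts[page] >= self.hot_threshold:
            return "DRAM"  # hot page: promoted to fast, leaky DRAM
        return "PCM"       # cold page: stays in dense, non-volatile PCM
```

For example, with the default threshold of 4, the first three accesses to a page are served from PCM, and the fourth and later accesses from DRAM.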
We have recently shown that we can architect a heterogeneous-latency-bitline DRAM, called Tiered-Latency DRAM (TL-DRAM) [41], by dividing a long bitline into two shorter segments using an isolation transistor: the low-latency segment can be accessed with the latency and efficiency of a short-bitline DRAM, while the high-latency segment enables high density, thereby reducing cost-per-bit. The additional area overhead of TL-DRAM is approximately 3% over commodity DRAM.
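The tiered-latency idea can be sketched as a toy latency model: rows in the near segment (below the isolation transistor) are served faster than rows in the far segment. The segment size and latency values below are illustrative placeholders, not the measured numbers from the TL-DRAM paper.

```python
# Toy model of TL-DRAM's two latency tiers within one subarray.
# All constants are hypothetical, for illustration only.

NEAR_SEGMENT_ROWS = 32  # hypothetical: rows in the low-latency (near) segment
NEAR_LATENCY_NS = 8     # hypothetical access latency for the near segment
FAR_LATENCY_NS = 14     # hypothetical access latency for the far segment

def access_latency_ns(row: int) -> int:
    """Return the access latency for a row in a tiered-latency subarray."""
    if row < NEAR_SEGMENT_ROWS:
        return NEAR_LATENCY_NS  # short bitline: fast sensing
    return FAR_LATENCY_NS       # long effective bitline: slower sensing
```

A system can exploit this asymmetry by, for example, using the near segment as a hardware-managed cache for frequently accessed rows.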
Summary
Main memory is a critical component of all computing systems, whether they are server, embedded, desktop, mobile, or sensor systems. Energy, cost, performance, and management algorithms must scale as we scale the size of the computing system in order to maintain performance growth and enable new applications. Such scaling has become difficult because recent trends in systems, applications, and technology exacerbate the memory system bottleneck.