DRAM Operation Research Articles

In modern systems, DRAM-based main memory is significantly slower than the processor. Consequently, processors spend a long time waiting to access data from main memory, making the long main memory access latency one of the most critical bottlenecks to achieving high system performance. Unfortunately, the latency of DRAM has remained almost constant in the past decade. This is mainly because DRAM has been optimized for cost-per-bit, rather than access latency. As a result, DRAM latency is not reducing with technology scaling, and continues to be an important performance bottleneck in modern and future systems.

This dissertation seeks to achieve low-latency DRAM-based memory systems at low cost in three major directions. The key idea of these three directions is to enable and exploit latency heterogeneity in DRAM architecture. First, based on the observation that long bitlines in DRAM are one of the dominant sources of DRAM latency, we propose a new DRAM architecture, Tiered-Latency DRAM (TL-DRAM), which divides the long bitline into two shorter segments using an isolation transistor, allowing one segment to be accessed with reduced latency. Second, we propose a fine-grained DRAM latency reduction mechanism, Adaptive-Latency DRAM, which optimizes DRAM latency for the common operating conditions of each individual DRAM module. We observe that DRAM manufacturers incorporate a very large timing margin as a provision against the worst-case operating conditions, which is accessing the slowest cell across all DRAM products with the worst latency at the highest temperature, even though such a slowest cell and such an operating condition are rare. Our mechanism dynamically optimizes DRAM latency to the current operating condition of the accessed DRAM module, thereby reliably improving system performance. Third, we observe that cells closer to the peripheral logic can be much faster than cells farther from the peripheral logic (a phenomenon we call architectural variation). Based on this observation, we propose a new technique, Architectural-Variation-Aware DRAM (AVA-DRAM), which reduces DRAM latency at low cost, by profiling and identifying only the inherently slower regions in DRAM to dynamically determine the lowest latency DRAM can operate at without causing failures.

This dissertation provides a detailed analysis of DRAM latency by using both circuit-level simulation with a detailed DRAM model and FPGA-based profiling of real DRAM modules. Our latency analysis shows that our low-latency DRAM mechanisms enable significant latency reductions, leading to large improvements in both system performance and energy efficiency across a variety of workloads in our evaluated systems, while ensuring reliable DRAM operation.
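
As a rough illustration of the Adaptive-Latency DRAM idea described above (adjusting latency to the current operating condition of the accessed module instead of always using the worst-case margin), the following Python sketch picks a latency from a per-module profile keyed by temperature. The class name, the profile format, and every temperature and latency value are hypothetical placeholders chosen for illustration; none of them come from the dissertation.

```python
import bisect

# Hypothetical worst-case latency (ns) used when no profiled entry applies.
WORST_CASE_NS = 13.0


class ModuleProfile:
    """Per-module table of (max_temperature_C, safe_latency_ns) entries.

    Assumption: each entry says "at or below this temperature, this
    latency was found to be reliable for this module".
    """

    def __init__(self, entries):
        # Sort by temperature bound so we can binary-search it.
        self._entries = sorted(entries)
        self._bounds = [bound for bound, _ in self._entries]

    def latency_for(self, temp_c):
        # Find the first temperature bound that is >= the current temperature.
        i = bisect.bisect_left(self._bounds, temp_c)
        if i == len(self._entries):
            # Hotter than anything profiled: fall back to the worst case.
            return WORST_CASE_NS
        return self._entries[i][1]


# Placeholder profile for one module.
profile = ModuleProfile([(55, 9.0), (70, 10.5), (85, 12.0)])
cool_latency = profile.latency_for(40)
hot_latency = profile.latency_for(90)
```

The fallback to the worst-case value keeps the sketch conservative: a reduced latency is used only for conditions that the (hypothetical) profile actually covers.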

Long DRAM latency is a critical performance bottleneck in current systems. DRAM access latency is defined by three fundamental operations that take place within the DRAM cell array: (i) activation of a memory row, which opens the row to perform accesses; (ii) precharge, which prepares the cell array for the next memory access; and (iii) restoration of the row, which restores the values of cells in the row that were destroyed due to activation. There is significant latency variation for each of these operations across the cells of a single DRAM chip due to irregularity in the manufacturing process. As a result, some cells are inherently faster to access, while others are inherently slower. Unfortunately, existing systems do not exploit this variation. The goal of this work is to (i) experimentally characterize and understand the latency variation across cells within a DRAM chip for these three fundamental DRAM operations, and (ii) develop new mechanisms that exploit our understanding of the latency variation to reliably improve performance. To this end, we comprehensively characterize 240 DRAM chips from three major vendors, and make several new observations about latency variation within DRAM. We find that (i) there is large latency variation across the cells for each of the three operations; (ii) variation characteristics exhibit significant spatial locality: slower cells are clustered in certain regions of a DRAM chip; and (iii) the three fundamental operations exhibit different reliability characteristics when the latency of each operation is reduced. Based on our observations, we propose Flexible-LatencY DRAM (FLY-DRAM), a mechanism that exploits latency variation across DRAM cells within a DRAM chip to improve system performance. The key idea of FLY-DRAM is to exploit the spatial locality of slower cells within DRAM, and access the faster DRAM regions with reduced latencies for the fundamental operations. 
Our evaluations show that FLY-DRAM improves the performance of a wide range of applications by 13.3%, 17.6%, and 19.5%, on average, for each of the three different vendors' real DRAM chips, in a simulated 8-core system. We conclude that the experimental characterization and analysis of latency variation within modern DRAM, provided by this work, can lead to new techniques that improve DRAM and system performance.
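
The key idea of FLY-DRAM stated above (slower cells cluster in certain regions, so the remaining regions can be accessed with reduced latencies) can be sketched as a simple region lookup. This is a minimal Python sketch under stated assumptions: the region granularity, the names `RegionLatencyTable` and `timings_for_row`, and all timing values are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Timings:
    """Latencies (ns) for the three fundamental operations.

    All values used below are illustrative placeholders.
    """
    activation_ns: float
    precharge_ns: float
    restoration_ns: float


STANDARD = Timings(activation_ns=13.0, precharge_ns=13.0, restoration_ns=35.0)
REDUCED = Timings(activation_ns=10.0, precharge_ns=10.0, restoration_ns=28.0)


class RegionLatencyTable:
    """Maps a row to timings based on whether its region was profiled as slow."""

    def __init__(self, rows_per_region: int, slow_regions: Iterable[int]):
        self.rows_per_region = rows_per_region
        self.slow_regions = frozenset(slow_regions)

    def timings_for_row(self, row: int) -> Timings:
        # Rows are grouped into fixed-size regions (an assumption of this sketch).
        region = row // self.rows_per_region
        # Slow regions keep standard timings; all others use reduced timings.
        return STANDARD if region in self.slow_regions else REDUCED


# Hypothetical profiling result: only region 3 contains slow cells.
table = RegionLatencyTable(rows_per_region=512, slow_regions={3})
fast = table.timings_for_row(0)
slow = table.timings_for_row(3 * 512)
```

Keeping the slow regions as an explicit set means unprofiled behavior is easy to make conservative: a controller could start with every region marked slow and clear entries only after characterization.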

Articles published on DRAM Operation

Overhang Saddle Fin Sidewall Structure for Highly Reliable DRAM Operation

Silicon Wafer CMP Slurry Using a Hydrolysis Reaction Accelerator with an Amine Functional Group Remarkably Enhances Polishing Rate

Towards Enhanced System Efficiency while Mitigating Row Hammer

Design of Processing-“Inside”-Memory Optimized for DRAM Behaviors

Reducing DRAM Latency at Low Cost by Exploiting Heterogeneity

Effect of OFF-state stress on reliability of nMOSFET in SWD circuits of DRAM

TWiCe: Time Window Counter Based Row Refresh to Prevent Row-Hammering

In-DRAM Data Initialization

Bank-Group Level Parallelism

Rank-Level Parallelism in DRAM

Understanding Reduced-Voltage Operation in Modern DRAM Devices

Retention and Scalability Perspective of Sub-100-nm Double Gate Tunnel FET DRAM

Refresh-Aware Write Recovery Memory Controller

Improving DRAM Performance in 3-D ICs via Temperature Aware Refresh

DRAF

Understanding Latency Variation in Modern DRAM Chips

Fast Bulk Bitwise AND and OR in DRAM

Methodology for Cycle-Accurate DRAM Performance Analysis

Modeling of Dynamic Operation of T-RAM Cells

Understanding and Analyzing the Impact of Memory Controller's Scheduling Policies on DRAM's Energy and Performance
