In modern systems, DRAM-based main memory is significantly slower than the processor. Consequently, processors spend a long time waiting to access data from main memory, making the long main memory access latency one of the most critical bottlenecks to achieving high system performance. Unfortunately, the latency of DRAM has remained almost constant in the past decade. This is mainly because DRAM has been optimized for cost-per-bit, rather than access latency. As a result, DRAM latency is not decreasing with technology scaling, and continues to be an important performance bottleneck in modern and future systems.

This dissertation seeks to achieve low-latency DRAM-based memory systems at low cost in three major directions. The key idea of these three directions is to enable and exploit latency heterogeneity in DRAM architecture. First, based on the observation that long bitlines in DRAM are one of the dominant sources of DRAM latency, we propose a new DRAM architecture, Tiered-Latency DRAM (TL-DRAM), which divides the long bitline into two shorter segments using an isolation transistor, allowing one segment to be accessed with reduced latency. Second, we propose a fine-grained DRAM latency reduction mechanism, Adaptive-Latency DRAM, which optimizes DRAM latency for the common operating conditions of each individual DRAM module. We observe that DRAM manufacturers incorporate a very large timing margin as a provision against the worst-case operating condition, which is accessing the slowest cell across all DRAM products at the highest temperature, even though such a slow cell and such an operating condition are rare. Our mechanism dynamically optimizes DRAM latency to the current operating condition of the accessed DRAM module, thereby reliably improving system performance. Third, we observe that cells closer to the peripheral logic can be much faster than cells farther from the peripheral logic (a phenomenon we call architectural variation). Based on this observation, we propose a new technique, Architectural-Variation-Aware DRAM (AVA-DRAM), which reduces DRAM latency at low cost by profiling and identifying only the inherently slower regions in DRAM to dynamically determine the lowest latency at which DRAM can operate without causing failures.

This dissertation provides a detailed analysis of DRAM latency by using both circuit-level simulation with a detailed DRAM model and FPGA-based profiling of real DRAM modules. Our latency analysis shows that our low-latency DRAM mechanisms enable significant latency reductions, leading to large improvements in both system performance and energy efficiency across a variety of workloads in our evaluated systems, while ensuring reliable DRAM operation.
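
As an illustration of the Adaptive-Latency DRAM idea of matching timing parameters to the current operating condition of a module, the following is a minimal sketch and not the dissertation's implementation. It assumes a hypothetical per-module profile that lists, for each temperature bound, a set of timings found reliable up to that temperature. The names `Timings`, `MODULE_PROFILE`, and `select_timings`, and all reduced timing values, are assumptions made for illustration; only the fallback values are meant to resemble typical DDR3-style worst-case timings.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Timings:
    """A set of DRAM timing parameters, in nanoseconds."""
    trcd_ns: float  # activate-to-read/write delay
    tras_ns: float  # activate-to-precharge delay
    twr_ns: float   # write recovery time
    trp_ns: float   # precharge-to-activate delay


# Worst-case baseline used when no reduced setting is known to be safe.
# Illustrative values in the style of standard DDR3 timings.
STANDARD = Timings(trcd_ns=13.75, tras_ns=35.0, twr_ns=15.0, trp_ns=13.75)

# Hypothetical profile for one module: (maximum temperature in Celsius,
# timings assumed reliable up to that temperature), sorted by temperature.
# These reduced values are made up for the example, not measured data.
MODULE_PROFILE: List[Tuple[float, Timings]] = [
    (55.0, Timings(trcd_ns=10.0, tras_ns=23.75, twr_ns=10.0, trp_ns=11.25)),
    (70.0, Timings(trcd_ns=11.25, tras_ns=27.5, twr_ns=12.5, trp_ns=12.5)),
]


def select_timings(temp_c: float,
                   profile: List[Tuple[float, Timings]],
                   fallback: Timings = STANDARD) -> Timings:
    """Return the timings of the first profile entry whose temperature
    bound covers temp_c, or the worst-case fallback if none does."""
    for max_temp_c, timings in profile:
        if temp_c <= max_temp_c:
            return timings
    return fallback


if __name__ == "__main__":
    for temp in (40.0, 60.0, 85.0):
        print(temp, select_timings(temp, MODULE_PROFILE))
```

In this sketch a memory controller would call `select_timings` whenever the module's temperature reading changes; falling back to the worst-case timings outside the profiled range reflects the goal of reducing latency only where operation is known to remain reliable.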