Abstract

Emerging software stacks process ever-increasing amounts of data, straining the virtual memory layer of modern computer systems. In particular, address translation has become an acute system performance bottleneck. In response, we propose a class of cache prefetchers triggered by page table walk (PTW) activity. Our scheme, translation-enabled memory prefetching optimizations (TEMPO), hinges on two observations. First, a substantial fraction of DRAM references in modern big-data workloads are devoted to accessing page tables (PTs). Second, when memory references require PT lookups in DRAM, the majority also access DRAM for the subsequent data reference. TEMPO exploits these observations to prefetch into the cache the data pointed to by the PT entry. TEMPO requires only modest hardware changes and no OS or application-level modifications. Overall, TEMPO improves performance by 10-30 percent and reduces energy consumption by 1-14 percent.
