Abstract

Current techniques for prefetching linked data structures (LDS) exploit the work available in one loop iteration or recursive call to overlap pointer chasing latency. Jump pointers, which provide direct access to non-adjacent nodes, can be used for prefetching when loop and recursive procedure bodies are small and do not have sufficient work to overlap a long latency. This paper describes a framework for jump-pointer prefetching (JPP) that supports four prefetching idioms: queue, full, chain, and root jumping and three implementations: software-only, hardware-only, and a cooperative software/hardware technique. On a suite of pointer intensive programs, jump pointer prefetching reduces memory stall time by 72% for software, 83% for cooperative and 55% for hardware, producing speedups of 15%, 20% and 22% respectively.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call