Scalable Load and Store Processing in Latency Tolerant Processors

Amit Gandhi, Haitham Akkary, Konrad Lai, Srikanth T Srinivasan, Ravi Rajwar

doi:10.1145/1080695.1070007

Abstract

Memory latency tolerant architectures support thousands of in-flight instructions without scaling cycle-critical processor resources, and thousands of useful instructions can complete in parallel with a miss to memory. These architectures, however, require large queues to track all loads and stores executed while a miss is pending. Hierarchical designs alleviate the cycle-time impact of these structures, but the CAM and search functions required to enforce memory ordering and provide data forwarding place high demands on area and power. We present new load-store processing algorithms for latency tolerant architectures. We augment the primary load and store queues with secondary buffers. The secondary load buffer is a set-associative structure, similar to a cache. The secondary store buffer, the Store Redo Log (SRL), is a first-in first-out structure that records the program order of all stores completed in parallel with a miss, and has no CAM or search functions. Instead of the secondary store queue, a cache provides temporary forwarding. The SRL enforces memory ordering by ensuring that memory updates occur in program order once the miss returns. The new algorithms eliminate the CAM and search functions in the secondary load and store buffers, and remove fundamental sources of complexity, power, and area inefficiency in load/store processing. The new organization, while being area and power efficient, is competitive in performance with hierarchical designs.
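The store path described in the abstract can be pictured with a small software model. The sketch below is only an illustration under stated assumptions, not the authors' hardware design: the class name `StoreRedoLogModel`, its methods, and the use of a Python dictionary as the forwarding cache are all hypothetical. It shows stores appended to a FIFO in program order while a miss is pending, loads served by an indexed lookup rather than a search of older stores, and memory updated in program order once the miss returns.

```python
from collections import deque


class StoreRedoLogModel:
    """Toy software model; every name here is an illustrative assumption."""

    def __init__(self, memory):
        self.memory = memory          # committed memory state: address -> value
        self.srl = deque()            # FIFO of (address, value) in program order
        self.forwarding_cache = {}    # address -> latest value stored during the miss

    def store(self, address, value):
        # Record the store in program order; older entries are never searched.
        self.srl.append((address, value))
        # Indexed write so that later loads can pick up the value.
        self.forwarding_cache[address] = value

    def load(self, address):
        # Indexed lookup in the forwarding cache, falling back to memory.
        if address in self.forwarding_cache:
            return self.forwarding_cache[address]
        return self.memory.get(address, 0)

    def drain(self):
        # Called when the miss returns: replay stores to memory, oldest first.
        while self.srl:
            address, value = self.srl.popleft()
            self.memory[address] = value
        self.forwarding_cache.clear()


memory = {0x10: 1}
model = StoreRedoLogModel(memory)
model.store(0x10, 7)
model.store(0x20, 8)
model.store(0x10, 9)
forwarded = model.load(0x10)   # the latest store to 0x10 is forwarded
untouched = memory[0x10]       # memory is not updated until the drain
model.drain()
```

In this model the FIFO only ever sees appends and pops from the head, which is the property that lets the structure avoid CAM and search logic; real hardware would also have to handle cache capacity, evictions, and load ordering checks, which this sketch leaves out.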


Journal: ACM SIGARCH Computer Architecture News
Publication Date: May 1, 2005
Citations: 37

Similar Papers

Scalable Load and Store Processing in latency tolerant processors
Amit Gandhi
20 Jan 2023

Scalable Load and Store Processing in Latency-Tolerant Processors
A Gandhi ... R Rajwar
IEEE Micro | VOL. 26
01 Jan 2006

The Superfluous Load Queue
Alberto Ros ... Stefanos Kaxiras
01 Jan 2018

Kilo-instruction Processors
Adrián Cristal ... Josep Llosa
01 Jan 2003
