Caching values in the load store queue

D Nicolaescu, A Veidenbaum, A Nicolau

doi:10.1109/mascot.2004.1348315

Abstract

The latency of an L1 data cache continues to grow with increasing clock frequency, cache size, and associativity. This increased latency is an important source of performance loss in high-performance processors. The paper proposes caching data using the load store queue (LSQ) hardware and data paths. With very little additional hardware, this inexpensive cache improves performance and reduces energy consumption. The modified LSQ "caches" all previously accessed data values, going beyond existing store-to-load forwarding techniques. Both load and store data are placed in the LSQ and are retained there after the corresponding memory access instruction has been committed. A 128-entry modified LSQ design is shown to allow an average of 51% of all loads in the SPECint2000 benchmarks to get their data from the LSQ. With a 1-cycle LSQ access latency and a 3-cycle L1 cache latency, performance on SPECint2000 improves by up to 7%, and the average speedup is over 4%.
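The idea in the abstract can be illustrated with a toy software model. The sketch below is not the authors' design: the class name `ValueCachingLSQ`, the one-entry-per-address organization, the FIFO replacement, and the assumption that every LSQ miss costs exactly the L1 latency are all hypothetical simplifications chosen for illustration. Only the 128-entry size, the 1-cycle and 3-cycle latencies, and the 51% hit rate come from the abstract.

```python
from collections import OrderedDict


class ValueCachingLSQ:
    """Toy model of an LSQ that retains load and store values after commit.

    Assumptions (not from the paper): one entry per address, FIFO
    replacement, and every LSQ miss is served at the L1 latency.
    """

    def __init__(self, capacity=128, lsq_latency=1, l1_latency=3):
        self.capacity = capacity
        self.lsq_latency = lsq_latency
        self.l1_latency = l1_latency
        self.entries = OrderedDict()  # address -> value, oldest entry first
        self.memory = {}              # stand-in for the L1 cache and below

    def _insert(self, addr, value):
        if addr in self.entries:
            # Refresh an existing entry so it becomes the youngest.
            del self.entries[addr]
        elif len(self.entries) >= self.capacity:
            # Queue is full: drop the oldest entry.
            self.entries.popitem(last=False)
        self.entries[addr] = value

    def store(self, addr, value):
        # Store data is written to memory and also kept in the queue.
        self.memory[addr] = value
        self._insert(addr, value)

    def load(self, addr):
        """Return (value, latency_in_cycles) for a load."""
        if addr in self.entries:
            # Hit: the value comes from an earlier load or store.
            return self.entries[addr], self.lsq_latency
        # Miss: fetch from memory and keep the loaded value for reuse.
        value = self.memory.get(addr, 0)
        self._insert(addr, value)
        return value, self.l1_latency


def average_load_latency(hit_rate, lsq_latency=1, l1_latency=3):
    """Expected load latency if misses always cost the L1 latency."""
    return hit_rate * lsq_latency + (1 - hit_rate) * l1_latency


lsq = ValueCachingLSQ(capacity=128)
lsq.store(0x10, 7)
print(lsq.load(0x10))              # served from the queue at LSQ latency
print(lsq.load(0x20))              # first touch goes to the L1 stand-in
print(lsq.load(0x20))              # repeated load now hits in the queue
print(average_load_latency(0.51))  # roughly 1.98 cycles under these assumptions
```

Under these simplifications, a 51% LSQ hit rate would cut the average load latency from 3 cycles to about 1.98 cycles. The reported speedups are much smaller than that ratio because load latency is only one component of total execution time.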

Similar Papers

LSQ: a power efficient and scalable implementation
F Castro ... L Pinuel
IEE Proceedings - Computers and Digital Techniques | VOL. 153
01 Jan 2006

Hybrid timing-address oriented load-store queue filtering for an x86 architecture
R Apolloni ... M Prieto
IET Computers & Digital Techniques | VOL. 5
01 Mar 2011

SOLE: Speculative one-cycle load execution with scalability, high-performance and energy-efficiency
Zhenhao Zhang ... Xiaoyin Wang
01 Sep 2012

Software-Hardware Cooperative Memory Disambiguation
R Huang ... A Garg
27 Feb 2006
