Data cache prefetching design space exploration for BlueGene/L supercomputer

J.R. Brunheroto, D. Hoenicke, V. Salapura, A. Gara, F.F. Redigolo

doi:10.1109/cahpc.2005.23

Abstract

Scientific applications exhibit good spatial and temporal locality in their data memory accesses. A prefetch cache can exploit this locality in three ways: it hides the latency of the level 3 (L3) cache, it reduces contention between multiple cores sharing a single L3 cache, and it decouples the cache line size mismatch between the L3 cache and the level 1 (L1) data cache. It does so by identifying data streams that can be profitably prefetched. This work presents a design space exploration that helped shape the design of the BlueGene/L supercomputer memory subsystem. The prefetch cache consists of a small number of 128-byte line buffers that speculatively prefetch data from the L3 cache. Because applications exhibit some sequential access patterns, this prefetching scheme increases the likelihood that a request from the L1 data cache is already present in the prefetch cache. Because most compute-intensive applications contain only a small number of data streams, a small number of line buffers is sufficient for the prefetch cache to detect and track them. This paper focuses on the evaluation of stream detection mechanisms and on the influence of varying the replacement policy for stream prefetch caches.
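
The mechanism described in the abstract can be illustrated with a toy model. The sketch below is not the BlueGene/L design: it assumes 128-byte buffer lines, a hypothetical detection rule (a stream is assumed whenever the preceding line is still buffered), next-line prefetching, and LRU replacement. The class name `PrefetchCache` and its parameters are made up for illustration.

```python
from collections import OrderedDict

LINE_BYTES = 128  # assumed buffer line size in bytes


class PrefetchCache:
    """Toy prefetch cache: few line buffers, next-line streams, LRU eviction."""

    def __init__(self, num_buffers=4):
        self.num_buffers = num_buffers
        # Maps line number -> True; ordered from least to most recently used.
        self.buffers = OrderedDict()
        self.hits = 0
        self.misses = 0

    def _install(self, line):
        """Insert or refresh a line, evicting the LRU line when full."""
        if line in self.buffers:
            self.buffers.move_to_end(line)
            return
        if len(self.buffers) >= self.num_buffers:
            self.buffers.popitem(last=False)  # drop least recently used
        self.buffers[line] = True

    def access(self, address):
        """Serve one L1 request; return True if it hit in a line buffer."""
        line = address // LINE_BYTES
        hit = line in self.buffers
        if hit:
            self.hits += 1
        else:
            self.misses += 1
        # Hypothetical detection rule: the previous line is still buffered.
        stream_detected = (line - 1) in self.buffers
        self._install(line)
        if stream_detected:
            self._install(line + 1)  # speculative next-line prefetch
        return hit
```

Under these assumptions, a sequential walk in 32-byte steps misses only until the stream is detected, after which each new line is already buffered when it is first requested. Swapping the eviction in `_install` for another policy is one way to experiment with the replacement-policy question the abstract raises.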
