Impact of modern memory subsystems on cache optimizations for stencil computations

Shoaib Kamil, Parry Husbands, Katherine Yelick, Leonid Oliker, John Shalf

doi:10.1145/1111583.1111589

Abstract

In this work we investigate the impact of evolving memory system features, such as large on-chip caches, automatic prefetch, and the growing distance to main memory, on 3D stencil computations. These calculations form the basis for a wide range of scientific applications, from simple Jacobi iterations to complex multigrid and block-structured adaptive PDE solvers. First, we develop a simple benchmark to evaluate the effectiveness of prefetching in cache-based memory systems. Next, we present a small parameterized probe and validate its use as a proxy for general stencil computations on three modern microprocessors. We then derive an analytical memory cost model for quantifying cache-blocking behavior and demonstrate its effectiveness in predicting stencil-computation performance. Overall, the results demonstrate that recent trends in memory system organization have reduced the efficacy of traditional cache-blocking optimizations.
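For readers unfamiliar with the terms, the following is a minimal sketch of what a 3D stencil sweep and a cache-blocked variant of it can look like. It is an illustration only, not the benchmark, probe, or cost model from the paper: the 7-point averaging stencil, the row-major layout, the function names (`idx`, `update`, `sweep_naive`, `sweep_blocked`), and the choice to tile the `j` and `k` loops while streaming through `i` are all assumptions made here. Pure Python will not show any cache effect; the point is only the loop structure, and that blocking reorders the traversal without changing the result.

```python
def idx(i, j, k, n):
    """Flat row-major index into an n*n*n grid; k is the unit-stride dimension."""
    return (i * n + j) * n + k


def update(src, dst, i, j, k, n):
    """7-point Jacobi-style update: average of the six face neighbours."""
    c = idx(i, j, k, n)
    dst[c] = (src[c - 1] + src[c + 1]            # k neighbours
              + src[c - n] + src[c + n]          # j neighbours
              + src[c - n * n] + src[c + n * n]  # i neighbours
              ) / 6.0


def sweep_naive(src, n):
    """One sweep over all interior points in plain i, j, k order."""
    dst = list(src)  # boundary values are copied unchanged
    for i in range(1, n - 1):
        for j in range(1, n - 1):
            for k in range(1, n - 1):
                update(src, dst, i, j, k, n)
    return dst


def sweep_blocked(src, n, bj, bk):
    """Same sweep, but the j and k loops are tiled into bj x bk blocks.

    Each tile is streamed through the full i range, so the working set
    per tile is a few bj x bk plane slices rather than whole planes.
    Tile sizes are hypothetical tuning parameters.
    """
    dst = list(src)
    for jj in range(1, n - 1, bj):
        for kk in range(1, n - 1, bk):
            for i in range(1, n - 1):
                for j in range(jj, min(jj + bj, n - 1)):
                    for k in range(kk, min(kk + bk, n - 1)):
                        update(src, dst, i, j, k, n)
    return dst
```

Because both versions call the same `update` on the same inputs, `sweep_blocked(grid, n, bj, bk)` returns exactly the same list as `sweep_naive(grid, n)` for any positive tile sizes; only the order in which memory is touched differs, which is the property a cache-blocking study measures.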
