Towards a Multi-Level Cache Performance Model for 3D Stencil Computation

Ràul De La Cruz, Mauricio Araya-Polo

doi:10.1016/j.procs.2011.04.235

Open Access

https://doi.org/10.1016/j.procs.2011.04.235

Journal: Procedia Computer Science
Publication Date: Jan 1, 2011
Citations: 17
License type: CC BY-NC-ND

Abstract

Optimizing stencil computations is crucial because they are the core (and the most computationally demanding segment) of many scientific computing applications, so speeding them up reduces overall execution time. This is not a simple task; it is lengthy and tedious. It is lengthy because of the large number of stencil optimization combinations to test, which might consume days of computing time, and tedious because of the slightly different versions of code that must be implemented. Alternatively, models that predict performance can be built without any actual stencil execution, thus reducing the cumbersome optimization task. Previous works have proposed cache-miss and execution-time models for specific stencil optimizations. However, most of them have been designed for 2D datasets or for stencil sizes that only suit low-order numerical schemes. We propose a flexible and accurate model for a wide range of stencil sizes, up to high-order schemes, that captures the behavior of 3D stencil computations using platform parameters. The model has been tested on a group of representative hardware architectures, using realistic dataset sizes. Our model successfully predicts stencil execution times and cache misses. Prediction accuracy depends on the platform; for instance, on x86 architectures prediction errors range between 1% and 20%. The model is therefore reliable and can help to speed up the stencil computation optimization process. To that end, other stencil optimization techniques can be added to this model, thus essentially providing a framework that covers most of the state of the art.
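To make the terminology concrete, here is a minimal sketch of a 3D star stencil kernel. It is an illustration only, not the authors' model or code: the function names, the `u[z][y][x]` layout, and the per-distance coefficient scheme are all assumptions made for this example. It shows why higher-order schemes (larger radius) stress the cache hierarchy: each output point reads `6 * radius + 1` grid points spread over `2 * radius + 1` planes.

```python
def star_stencil_3d(u, radius, coeffs):
    """Apply a 3D star (axis-aligned) stencil of the given radius to grid u.

    Hypothetical layout: u is a nested list indexed u[z][y][x].
    coeffs[0] weights the centre point; coeffs[r] weights the six
    neighbours at distance r. Only interior points are updated;
    points within `radius` of the boundary are left at 0.0.
    """
    nz, ny, nx = len(u), len(u[0]), len(u[0][0])
    out = [[[0.0] * nx for _ in range(ny)] for _ in range(nz)]
    for z in range(radius, nz - radius):
        for y in range(radius, ny - radius):
            for x in range(radius, nx - radius):  # innermost, unit-stride axis
                acc = coeffs[0] * u[z][y][x]
                for r in range(1, radius + 1):
                    acc += coeffs[r] * (
                        u[z][y][x - r] + u[z][y][x + r]    # same row
                        + u[z][y - r][x] + u[z][y + r][x]  # other rows, same plane
                        + u[z - r][y][x] + u[z + r][y][x]  # other planes
                    )
                out[z][y][x] = acc
    return out


def points_per_update(radius):
    """Grid points read per output point for a 3D star stencil."""
    return 6 * radius + 1


def planes_touched(radius):
    """Distinct z-planes read per output point."""
    return 2 * radius + 1
```

For example, a radius-4 star stencil reads 25 points from 9 different planes per update, so whether those planes stay resident in a given cache level largely determines the miss count that a performance model has to predict.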

Similar Papers

Evaluating optimizations that reduce global memory accesses of stencil computations in GPGPUs
Thiago Carrijo Nasciutti ... Jairo Panetta
Concurrency and Computation: Practice and Experience | VOL. 31
14 Sep 2018

An Efficient GPU Implementation Technique for Higher-Order 3D Stencils
Omer Anjum ... Wen-Mei Hwu
01 Aug 2019

PIMS
Jie Li ... Antonino Tumeo
30 Sep 2019

Simple, Accurate, Analytical Time Modeling and Optimal Tile Size Selection for GPGPU Stencils
Nirmal Prajapati ... Rumen Andonov
ACM SIGPLAN Notices | VOL. 52
26 Jan 2017
