Data Layout Transformation for Stencil Computations on Short-Vector SIMD Architectures

Tom Henretty, Franz Franchetti, Louis-Noël Pouchet, J Ramanujam, Kevin Stock, P Sadayappan

doi:10.1007/978-3-642-19861-8_13

Open Access

https://doi.org/10.1007/978-3-642-19861-8_13

Abstract

Stencil computations are at the core of applications in many domains, such as computational electromagnetics, image processing, and partial differential equation solvers used in a variety of scientific and engineering applications. Short-vector SIMD instruction sets such as SSE and VMX provide a promising and widely available avenue for enhancing performance on modern processors. However, a fundamental memory stream alignment issue limits the performance achieved by stencil computations on modern short-vector SIMD architectures. In this paper, we propose a novel data layout transformation that avoids the stream alignment conflict, along with a static analysis technique for determining where this transformation is applicable. Significant performance increases are demonstrated for a variety of stencil codes on three modern SIMD-capable processors.

Keywords: Single Precision, Data Layout, Access Function, Reuse Distance, Innermost Loop

These keywords were added by machine and not by the authors.
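The abstract does not spell out the transformation, so the following is only an illustrative sketch of the stream alignment problem and of one layout that sidesteps it. In a 3-point stencil `out[i] = a[i-1] + a[i] + a[i+1]`, the three input streams are shifted by one element relative to each other, so at most one of them can be loaded with aligned vector loads. The sketch assumes a vector length `v` and a transposed layout in which the array of length `v * m` is viewed as a `v x m` matrix and stored column by column, so that the neighbours of every lane sit in the adjacent, aligned vectors. All function names are hypothetical, plain Python lists stand in for SIMD registers, and this should not be read as the authors' implementation.

```python
def stencil_reference(a):
    """3-point stencil on the original layout; the two endpoints stay 0."""
    out = [0.0] * len(a)
    for i in range(1, len(a) - 1):
        out[i] = a[i - 1] + a[i] + a[i + 1]
    return out


def to_lifted(a, v):
    """Assumed layout: d[j*v + lane] = a[lane*m + j], with m = len(a) // v."""
    m = len(a) // v
    return [a[lane * m + j] for j in range(m) for lane in range(v)]


def from_lifted(d, v):
    """Inverse of to_lifted."""
    m = len(d) // v
    return [d[j * v + lane] for lane in range(v) for j in range(m)]


def stencil_lifted(d, v):
    """Same stencil on the transposed layout (needs m >= 2).

    Each slice d[j*v:(j+1)*v] models one aligned vector of v lanes.
    """
    m = len(d) // v
    vec = lambda j: d[j * v:(j + 1) * v]
    out = [0.0] * len(d)

    # Interior vectors: all three operands are whole, aligned vectors.
    for j in range(1, m - 1):
        out[j * v:(j + 1) * v] = [
            l + c + r for l, c, r in zip(vec(j - 1), vec(j), vec(j + 1))
        ]

    # Boundary vectors: neighbours wrap into the adjacent lane, which on
    # real hardware would be a lane shift rather than an unaligned load.
    first, second, before_last, last = vec(0), vec(1), vec(m - 2), vec(m - 1)
    for lane in range(1, v):          # lane 0 of vector 0 is global index 0
        out[lane] = last[lane - 1] + first[lane] + second[lane]
    for lane in range(v - 1):         # last lane of last vector is index n-1
        out[(m - 1) * v + lane] = before_last[lane] + last[lane] + first[lane + 1]
    return out


a = [float((i * i) % 7) for i in range(16)]
v = 4
result = from_lifted(stencil_lifted(to_lifted(a, v), v), v)
print(result == stencil_reference(a))
```

In this sketch only the first and last vectors need cross-lane handling; every interior vector is computed from three aligned vectors, which is the property the layout is meant to provide.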


