A systematic approach to improving data locality across Fourier transforms and linear algebra operations

Doru Thom Popovici, Andrew Canning, John Shalf, Lin-Wang Wang, Zhengji Zhao

doi:10.1145/3447818.3460354

Abstract

The performance of most scientific applications depends on efficient mathematical libraries. For example, plane-wave-based Density Functional Theory codes for electronic structure calculations use highly optimized libraries for Fourier transforms, dense linear algebra (orthogonalization), and sparse linear algebra (non-local projectors in real space). Although vendor-tuned libraries offer efficient implementations of each standalone mathematical kernel, partitioning the computation into sequentially invoked kernels inhibits cross-kernel optimizations that could improve data locality across memory-bound operations. In this work we show that, by expressing these kernels as operations on high-dimensional tensors, cross-kernel dataflow optimizations that span FFT, dense, and sparse linear algebra can be readily exposed and exploited. We outline a systematic way of merging the Fourier transforms with the linear algebra computations, improving data locality and reducing data movement to main memory. We show that this streaming/dataflow approach offers a 2x speedup on GPUs and 8x/12x speedups on CPUs compared to a baseline code that uses vendor-optimized libraries. Although we use Density Functional Theory to demonstrate the value of our approach, our methodology is broadly applicable to other applications that use Fourier transforms and linear algebra operations as building blocks.
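To make the locality argument concrete, the following is a minimal sketch, not the authors' implementation. It assumes a simplified stand-in workload: applying a real-space pointwise potential to a batch of plane-wave coefficient arrays. The function names (`apply_potential_staged`, `apply_potential_fused`), the array shapes, and the choice of NumPy are all assumptions made for illustration. The staged version mimics sequential library calls, each sweeping the whole batch; the fused version carries one band through every stage before touching the next, so the working set is a single band.

```python
import numpy as np

def apply_potential_staged(psi_g, v_real):
    """Three separate passes over the whole batch, one call per stage."""
    psi_r = np.fft.ifftn(psi_g, axes=(1, 2, 3))  # pass 1: all bands to real space
    psi_r = psi_r * v_real                       # pass 2: pointwise multiply
    return np.fft.fftn(psi_r, axes=(1, 2, 3))    # pass 3: all bands back

def apply_potential_fused(psi_g, v_real):
    """Same arithmetic, but each band completes all stages before the next starts."""
    out = np.empty_like(psi_g)
    for b in range(psi_g.shape[0]):
        work = np.fft.ifftn(psi_g[b])  # small working set: one band only
        work *= v_real                 # multiply while the band is still resident
        out[b] = np.fft.fftn(work)
    return out

# Hypothetical sizes: 4 bands on an 8 x 8 x 8 grid.
rng = np.random.default_rng(0)
n_bands, n = 4, 8
shape = (n_bands, n, n, n)
psi_g = rng.standard_normal(shape) + 1j * rng.standard_normal(shape)
v_real = rng.standard_normal((n, n, n))

staged = apply_potential_staged(psi_g, v_real)
fused = apply_potential_fused(psi_g, v_real)
print(np.allclose(staged, fused))
```

Both functions compute the same result; only the order of traversal differs. Interpreted Python will not reproduce the speedups quoted in the abstract, and the paper's fusion operates at a finer granularity inside the kernels themselves, so this sketch should be read only as an illustration of why reordering stages can reduce traffic to main memory.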

