Acceleration with long vector architectures: Implementation and evaluation of the FFT kernel on NEC SX‐Aurora and RISC‐V vector extension

Pablo Vizcaino, Filippo Mantovani, Jesus Labarta, Roger Ferrer

doi:10.1002/cpe.7424

Abstract

Novel architectures that leverage long and variable vector lengths, such as the NEC SX‐Aurora or the vector extension of RISC‐V, are emerging as promising solutions in the supercomputing market. These architectures often require re‐coding of scientific kernels. For example, traditional implementations of algorithms for computing the fast Fourier transform (FFT) cannot take full advantage of vector architectures. In this article, we present implementations of FFT algorithms that are able to leverage these novel architectures. We evaluate these codes on the NEC SX‐Aurora, comparing them with the optimized NEC libraries, and on a prototype of a RISC‐V core with a vector processing unit. We present the benefits and limitations of two approaches to radix‐2 FFT vector implementations. We show that our approach makes better use of the vector unit of the NEC SX‐Aurora, reaching performance equal to or higher than that of the optimized NEC library. More generally, we demonstrate the importance of maximizing the vector length usage of the algorithm, taking advantage of the FFT properties to reduce long‐latency vector operations, and reordering the instructions according to the specific hardware features to boost the performance of FFT‐like computational kernels.
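
As a rough illustration of why vector length usage matters for a radix‐2 FFT, the following sketch implements a textbook iterative decimation‐in‐time radix‐2 FFT in plain Python. It is not the implementation evaluated in the article; the function names (`fft_radix2`, `stage_vector_lengths`) and the loop ordering are assumptions chosen for illustration. In this particular loop ordering, the innermost loop of each stage runs over `half` independent butterflies, so the early stages expose only very short contiguous runs to a vector unit.

```python
import cmath


def bit_reverse(i, bits):
    """Reverse the lowest `bits` bits of the integer i."""
    r = 0
    for _ in range(bits):
        r = (r << 1) | (i & 1)
        i >>= 1
    return r


def fft_radix2(x):
    """Textbook iterative radix-2 DIT FFT (illustrative, not the paper's code)."""
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    bits = n.bit_length() - 1

    # Reorder the input into bit-reversed index order.
    a = [0j] * n
    for i, v in enumerate(x):
        a[bit_reverse(i, bits)] = complex(v)

    half = 1  # number of butterflies per group in the current stage
    while half < n:
        # Twiddle factors for this stage: exp(-2*pi*i*k / (2*half)).
        tw = [cmath.exp(-1j * cmath.pi * k / half) for k in range(half)]
        for start in range(0, n, 2 * half):
            # These `half` butterflies are independent of each other, so
            # this inner loop is the natural candidate for vectorization.
            for k in range(half):
                u = a[start + k]
                t = tw[k] * a[start + k + half]
                a[start + k] = u + t
                a[start + k + half] = u - t
        half *= 2
    return a


def stage_vector_lengths(n):
    """Inner-loop trip count (contiguous butterflies) for each stage."""
    lengths = []
    half = 1
    while half < n:
        lengths.append(half)
        half *= 2
    return lengths
```

For example, `stage_vector_lengths(8)` returns `[1, 2, 4]`: with this ordering, only the last stages offer long inner loops, which is one way to see why restructuring the computation to keep vectors long can pay off on long‐vector hardware.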


Journal: Concurrency and Computation: Practice and Experience
Publication Date: Nov 2, 2022
Citations: 4
License type: CC BY-NC-ND 4.0


