HPCG on long-vector architectures: Evaluation and optimization on NEC SX-Aurora and RISC-V

Constantino Gómez, Filippo Mantovani, Erich Focht, Marc Casas

doi:10.1016/j.future.2023.01.015

Abstract

Accelerators are becoming a key component for improving efficiency in High-Performance Computing (HPC) systems. While GPU-based systems are widely used to accelerate HPC workloads, new systems based on long-vector architectures are rapidly gaining popularity. Developing optimized math libraries is fundamental to achieving high performance on these emerging vector architectures. This paper focuses on optimizing the HPCG benchmark, which comprises four fundamental kernels found in many numerical applications. We target two relevant long-vector architectures: the NEC Vector Engine and the RISC-V 'V' vector extension. Compared to the well-tuned proprietary solution, our open HPCG implementation achieves a 1.6% performance improvement on the NEC Vector Engine and near-maximum memory bandwidth utilization on the two evaluated RISC-V vector accelerator designs.

Full Text