A High-Speed NTT-Based Polynomial Multiplication Accelerator with Vector Extension of RISC-V for Saber Algorithm

Honglin Kuang, Yifan Zhao, Jun Han

doi:10.1109/apccas55924.2022.10090293

Abstract

Saber is a post-quantum cryptography (PQC) key encapsulation mechanism (KEM) based on the module learning with rounding problem. It is characterized by the use of power-of-two moduli, which makes all modular reductions free in hardware. However, this choice prevents the direct use of the asymptotically fastest number theoretic transform (NTT) for the time-consuming polynomial multiplication in Saber. To multiply polynomials efficiently, prior work has relied on schoolbook, Toom-Cook, or Karatsuba algorithms. Although these approaches achieve decent operating speed at moderate area cost, they are a poor fit when the system must be extended to support multiple PQC protocols. To enable NTT for Saber, we choose an appropriate prime and use the sign-magnitude format for computation. We propose a concise and efficient vectorized NTT algorithm, based on which we design a configurable vector NTT unit that performs NTT and other arithmetic operations. The accelerator is carefully pipelined to achieve high speed and is driven by a custom vector instruction extension of RISC-V. We implement the proposed architecture with 32 and 16 vector lanes on a Xilinx UltraScale+ ZCU111. Results show that our design achieves up to $5\times$ and $3\times$ improvement in computation time and area-time product (ATP), respectively, for degree-256 polynomial multiplication, compared to state-of-the-art Saber polynomial multipliers.
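The idea of running an NTT over a prime even though Saber's modulus is a power of two can be illustrated with a short software sketch. It is not the accelerator's algorithm, and every parameter in it is an assumption made for illustration: the prime `P = 998244353` and root `G = 3` are a commonly used NTT-friendly pair rather than the prime chosen in the paper, the secret coefficients are assumed to lie in a small signed range such as [-5, 5], and the function names are hypothetical. Signed (centered) integers stand in here for the sign-magnitude representation mentioned in the abstract. Because every coefficient of the exact integer product is far below `P/2` in magnitude, the product can be recovered from its residue modulo `P` and then reduced modulo 2^13.

```python
# Illustrative parameters (assumptions, not taken from the paper).
N = 256          # polynomial length used by Saber
Q = 1 << 13      # Saber's power-of-two modulus
P = 998244353    # NTT-friendly prime, 119 * 2**23 + 1 (assumed for this sketch)
G = 3            # a primitive root modulo P


def centered(x, m):
    """Return the signed representative of x modulo m."""
    x %= m
    return x - m if x >= m // 2 else x


def ntt(vals, root):
    """Recursive radix-2 transform: out[k] = sum(vals[j] * root**(j*k)) mod P."""
    n = len(vals)
    if n == 1:
        return list(vals)
    sq = root * root % P
    even, odd = ntt(vals[0::2], sq), ntt(vals[1::2], sq)
    out, w = [0] * n, 1
    for k in range(n // 2):
        t = w * odd[k] % P
        out[k] = (even[k] + t) % P
        out[k + n // 2] = (even[k] - t) % P
        w = w * root % P
    return out


def saber_style_mul(a, s):
    """Multiply a (mod Q) by small signed s in Z_Q[x]/(x^N + 1) via an NTT mod P."""
    psi = pow(G, (P - 1) // (2 * N), P)      # primitive 2N-th root of unity
    psi_inv, n_inv = pow(psi, P - 2, P), pow(N, P - 2, P)
    omega = psi * psi % P
    # Centre the inputs so the exact integer product stays well below P/2.
    a_c = [centered(x, Q) for x in a]
    # Pre-twist by powers of psi to turn the cyclic NTT into a negacyclic one.
    fa = ntt([x * pow(psi, i, P) % P for i, x in enumerate(a_c)], omega)
    fs = ntt([x * pow(psi, i, P) % P for i, x in enumerate(s)], omega)
    fc = [x * y % P for x, y in zip(fa, fs)]
    c = ntt(fc, pow(omega, P - 2, P))        # inverse transform, unscaled
    # Undo the scaling and the twist, recover signed integers, reduce mod Q.
    exact = [centered(v * n_inv % P * pow(psi_inv, i, P) % P, P)
             for i, v in enumerate(c)]
    return [v % Q for v in exact]


def schoolbook_negacyclic(a, s):
    """Reference O(N^2) product in Z_Q[x]/(x^N + 1), for comparison only."""
    c = [0] * N
    for i, x in enumerate(a):
        for j, y in enumerate(s):
            if i + j < N:
                c[i + j] += x * y
            else:
                c[i + j - N] -= x * y   # x^N = -1 wraps with a sign flip
    return [v % Q for v in c]
```

Comparing `saber_style_mul(a, s)` with `schoolbook_negacyclic(a, s)` on random inputs is a quick way to check that the detour through the prime field does not change the result modulo 2^13.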

Full Text