Abstract
We present the first Cortex-M4 implementation of the NISTPQC signature finalist Rainbow. We target the Giant Gecko EFM32GG11B, whose 512 kB of RAM can easily accommodate the keys of RainbowI. We present a fast constant-time bitsliced F16 multiplication that multiplies 32 field elements in 32 clock cycles. Additionally, we introduce a new way of computing the public map P in the verification procedure, which makes signature verification vastly faster. Both the signing and verification procedures of our implementation are by far the fastest among the NISTPQC signature finalists. Signing with rainbowI-classic requires roughly 957 000 clock cycles, which is 4× faster than the state-of-the-art Dilithium2 implementation and 45× faster than Falcon-512. Verification needs about 239 000 cycles, which is 5× faster than Dilithium2 and 2× faster than Falcon-512. The cost of signing can be further decreased by 20% by storing the secret key in a bitsliced representation.
Highlights
The advance of large-scale quantum computers threatens all conventional public-key cryptography currently deployed, due to Shor’s algorithm [Sho94].
We present optimized Cortex-M4 implementations of all three third-round instances of Rainbow with the parameter set aiming at National Institute of Standards and Technology (NIST) security level 1.
Rainbow was proposed by Ding and Schmidt in 2004 [DS05], with a multi-stage Unbalanced Oil and Vinegar (UOV) structure.
Summary
The advance of large-scale quantum computers threatens all conventional public-key cryptography currently deployed, due to Shor’s algorithm [Sho94]. We present optimized Cortex-M4 implementations of all three third-round instances of Rainbow with the parameter set aiming at NIST security level 1. There is a large body of work targeting the Cortex-M4 for the other NISTPQC finalists Dilithium [GKS20], Falcon [Por19], Kyber [ABCG20], Saber [KRS19, BMKV20, CHK+21], and NTRU [KRS19, CHK+21]. These lattice-based finalists use modular integer arithmetic and polynomial multiplication with relatively many coefficients, so their optimizations do not carry over to Rainbow. To our knowledge, the only prior M4 work on Rainbow is Moya Riera’s bachelor's thesis [MR19], which optimized the level 1 parameter sets of second-round Rainbow. That implementation used look-up tables for F16 arithmetic, which is not constant-time on all Cortex-M4 platforms, and (despite the smaller parameters) it was considerably slower than our work.
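To illustrate the contrast between table-based and bitsliced F16 arithmetic, the following is a minimal sketch of a bitsliced multiplication on 32 field elements at once. It is not the paper's routine: it assumes F16 is represented as F2[x]/(x^4 + x + 1), which may differ from the representation Rainbow uses, and the names `gf16x32`, `gf16x32_mul`, `gf16_mul_ref`, and `gf16x32_mismatches` are hypothetical. Bit i of word j holds coefficient j of element i, so each word-wide AND or XOR acts on all 32 elements, and no memory access or branch depends on the data.

```c
#include <stdint.h>

/* 32 F16 elements in bitsliced form: bit i of c[j] is coefficient j of
   element i. Assumption: F16 = F2[x]/(x^4 + x + 1). */
typedef struct { uint32_t c[4]; } gf16x32;

/* Multiply 32 pairs of elements using only word-wide AND and XOR. */
static gf16x32 gf16x32_mul(gf16x32 a, gf16x32 b) {
    /* Schoolbook product, degrees 0..6. */
    uint32_t d0 = a.c[0] & b.c[0];
    uint32_t d1 = (a.c[0] & b.c[1]) ^ (a.c[1] & b.c[0]);
    uint32_t d2 = (a.c[0] & b.c[2]) ^ (a.c[1] & b.c[1]) ^ (a.c[2] & b.c[0]);
    uint32_t d3 = (a.c[0] & b.c[3]) ^ (a.c[1] & b.c[2])
                ^ (a.c[2] & b.c[1]) ^ (a.c[3] & b.c[0]);
    uint32_t d4 = (a.c[1] & b.c[3]) ^ (a.c[2] & b.c[2]) ^ (a.c[3] & b.c[1]);
    uint32_t d5 = (a.c[2] & b.c[3]) ^ (a.c[3] & b.c[2]);
    uint32_t d6 = a.c[3] & b.c[3];

    /* Reduce with x^4 = x + 1, x^5 = x^2 + x, x^6 = x^3 + x^2. */
    gf16x32 r;
    r.c[0] = d0 ^ d4;
    r.c[1] = d1 ^ d4 ^ d5;
    r.c[2] = d2 ^ d5 ^ d6;
    r.c[3] = d3 ^ d6;
    return r;
}

/* Scalar reference: branch-free shift-and-add in the same field. */
static uint8_t gf16_mul_ref(uint8_t a, uint8_t b) {
    uint8_t r = 0;
    for (int i = 0; i < 4; i++) {
        uint8_t mask = (uint8_t)-((b >> i) & 1);   /* 0x00 or 0xFF */
        r ^= (uint8_t)(a & mask);
        uint8_t hi = (uint8_t)((a >> 3) & 1);      /* coefficient of x^3 */
        a = (uint8_t)(((a << 1) & 0xF) ^ (hi * 0x3)); /* a *= x, mod x^4+x+1 */
    }
    return r;
}

/* Pack 32 nibble-valued elements into bitsliced form. */
static gf16x32 gf16x32_pack(const uint8_t e[32]) {
    gf16x32 r = {{0, 0, 0, 0}};
    for (int i = 0; i < 32; i++)
        for (int j = 0; j < 4; j++)
            r.c[j] |= (uint32_t)((e[i] >> j) & 1u) << i;
    return r;
}

/* Extract element i from bitsliced form. */
static uint8_t gf16x32_get(gf16x32 v, int i) {
    uint8_t e = 0;
    for (int j = 0; j < 4; j++)
        e |= (uint8_t)(((v.c[j] >> i) & 1u) << j);
    return e;
}

/* Compare the bitsliced product with the reference for all 256 input
   pairs; returns the number of mismatches. */
static int gf16x32_mismatches(void) {
    int bad = 0;
    for (int a = 0; a < 16; a++) {
        uint8_t x[32], y[32];
        for (int i = 0; i < 32; i++) {
            x[i] = (uint8_t)a;
            y[i] = (uint8_t)(i & 0xF);
        }
        gf16x32 p = gf16x32_mul(gf16x32_pack(x), gf16x32_pack(y));
        for (int i = 0; i < 32; i++)
            bad += gf16x32_get(p, i) != gf16_mul_ref(x[i], y[i]);
    }
    return bad;
}
```

In this sketch the multiplication itself is 16 ANDs and 15 XORs on 32-bit words. The packing and unpacking helpers exist only for the self-check; keeping data in bitsliced form between operations avoids that conversion cost.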