Abstract
This paper proposes two different methods to perform NTT-based polynomial multiplication in polynomial rings that do not naturally support such a multiplication. We demonstrate these methods on the NTRU Prime key-encapsulation mechanism (KEM) proposed by Bernstein, Chuengsatiansup, Lange, and Vredendaal, which uses a polynomial ring that is, by design, not amenable to use with NTT. One of our approaches is using Good’s trick and focuses on speed and supporting more than one parameter set with a single implementation. The other approach is using a mixed radix NTT and focuses on the use of smaller multipliers and less memory. On a ARM Cortex-M4 microcontroller, we show that our three NTT-based implementations, one based on Good’s trick and two mixed radix NTTs, provide between 32% and 17% faster polynomial multiplication. For the parameter-set ntrulpr761, this results in between 16% and 9% faster total operations (sum of key generation, encapsulation, and decapsulation) and requires between 15% and 39% less memory than the current state-of-the-art NTRU Prime implementation on this platform, which is using Toom-Cook-based polynomial multiplication.
Highlights
Due to the ongoing advances in quantum computing, the threat by quantum computers to IT-security becomes more and more imminent: Experts predict that sufficiently large and stable quantum computers running Shor’s algorithm for factorization and solving discrete logarithms may be able to break currently wide-spread asymmetric cryptographic primitives in the ten to fifteen years
In the research field Post-Quantum Cryptography (PQC), researchers haven been investigating alternative cryptographic schemes that are believed to be secure against attacks aided by quantum computers
PQC-primitives based on lattice problems have attracted significant attention due to their efficient implementations that often are on par with or even better than current cryptographic schemes
Summary
Due to the ongoing advances in quantum computing, the threat by quantum computers to IT-security becomes more and more imminent: Experts predict that sufficiently large and stable quantum computers running Shor’s algorithm for factorization and solving discrete logarithms may be able to break currently wide-spread asymmetric cryptographic primitives in the ten to fifteen years. In contrast to other lattice-based schemes, which commonly use either cyclotomic polynomials to enable the use of an NTT or power-of-two moduli for efficient coefficientwise operations, it is a challenging task to implement NTRU Prime efficiently due to its specific design This challenge becomes even bigger on embedded devices with low resources in regard to computational power and memory. We implement two approaches of polynomial multiplication for NTRU Prime based on NTT on the Cortex-M4 architecture and show that our approaches are faster and more memory efficient than the current state-of-the-art. The current state-of-the-art implementation of NTRU Prime on Cortex-M4 is the work by Yang et al. that was included as optimized implementation for NTRU Prime in the pqm project in April 20203 It is using Toom-Cook for polynomial multiplication, fast modular inversion from [BY19], and platform-specific, hand-written assembly optimization.
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
More From: IACR Transactions on Cryptographic Hardware and Embedded Systems
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.