This paper focuses on enhancing the performance of the Nth-degree truncated-polynomial ring units key encapsulation mechanism (NTRU-KEM) algorithm, which ensures post-quantum resistance in the field of key establishment cryptography. The NTRU-KEM, while robust, suffers from increased storage and computational demands compared to classical cryptography, leading to significant memory and performance overheads. In environments with limited resources, the negative impacts of these overheads are more noticeable, leading researchers to investigate ways to speed up processes while also ensuring they are efficient in terms of area utilization. To address this, our research carefully examines the detailed functions of the NTRU-KEM algorithm, adopting a software/hardware co-design approach. This approach allows for customized computation, adapting to the varying requirements of operational timings and iterations. The key contribution is the development of a novel hardware acceleration technique focused on optimizing bus utilization. This technique enables parallel processing of multiple sub-functions, enhancing the overall efficiency of the system. Furthermore, we introduce a unique integrated register array that significantly reduces the spatial footprint of the design by merging multiple registers within the accelerator. In experiments conducted, the results of our work were found to be remarkable, with a time-area efficiency achieved that surpasses previous work by an average of 25.37 times. This achievement underscores the effectiveness of our optimization in accelerating the NTRU-KEM algorithm.
Read full abstract