Implementation of RSA Signatures on GPU and CPU Architectures

Eduardo Ochoa-Jimenez, Francisco Rodriguez-Henriquez, Luis Rivera-Zamarripa, Nareli Cruz-Cortes

doi:10.1109/access.2019.2963826

Open Access

https://doi.org/10.1109/access.2019.2963826

Journal: IEEE Access
Publication Date: Jan 1, 2020
Citations: 16
License type: CC BY 4.0

Affiliation: Instituto Politécnico Nacional

Abstract

This paper reports a constant-time CPU and GPU software implementation of the RSA exponentiation, using algorithms that offer a first line of defense against timing and cache attacks. On GPU platforms, the modular arithmetic layer was implemented using the Residue Number System (RNS) representation. We also present a CPU implementation of RNS-based arithmetic that takes advantage of the parallelism provided by the Advanced Vector Extensions 2 (AVX2) instructions. Moreover, we carefully analyze the performance of two popular RNS modular reduction algorithms when implemented on many- and multi-core platforms. On CPU platforms, we also report that a combination of the schoolbook and Karatsuba algorithms for integer multiplication, together with Montgomery reduction, yields our fastest modular multiplication procedure. In comparison with previous literature, our software library achieves faster timings for the computation of the RSA exponentiation using 1024-, 2048-, and 3072-bit private keys.

Highlights

Public key cryptosystems play an important role in communication systems that require the exchange of sensitive information.
We focus our attention on the efficient parallel computation of s1 and s2 in Graphics Processing Unit (GPU) and Central Processing Unit (CPU) software implementations.
Vector instructions: To implement the Residue Number System (RNS) based arithmetic described in Section II-C efficiently, we took advantage of the Advanced Vector Extensions 2 (AVX2) instruction set introduced in the Intel Haswell micro-architecture [31].

Summary

INTRODUCTION

Public key cryptosystems play an important role in communication systems that require the exchange of sensitive information. We focus our attention on the efficient parallel computation of s1 and s2 in GPU and CPU software implementations. Thanks to their massive parallelism, Graphics Processing Unit (GPU) platforms have become an attractive option for speeding up computationally demanding tasks such as the computation of several public key cryptographic primitives.

OUR CONTRIBUTIONS: In this work, two constant-time RSA software implementations for 1024-, 2048-, and 3072-bit RSA keys are presented. Our CPU software implementation of RSA uses a combination of integer arithmetic algorithms and Montgomery reduction that helped us exploit the fine-grained parallelism present in the latest Intel micro-architectures. The experimental results presented in this work outperform previously reported GPU RSA implementations [7]–[10] by factors of 1.24, 1.27, and 2.98 for RSA-1024, RSA-2048, and RSA-3072, respectively.
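This summary does not define s1 and s2. Assuming they denote the two half-size exponentiations of the standard RSA-CRT signature (one modulo each prime factor), the structure can be sketched as follows. The function name `rsa_crt_sign` and the toy parameters are illustrative assumptions; Python's built-in `pow` is not constant-time, so this shows only the data flow, not the paper's implementation.

```python
def rsa_crt_sign(m: int, p: int, q: int, d: int) -> int:
    """Sign the message representative m with RSA-CRT (illustrative only)."""
    dp = d % (p - 1)             # private exponent reduced modulo p - 1
    dq = d % (q - 1)             # private exponent reduced modulo q - 1
    q_inv = pow(q, -1, p)        # q^{-1} mod p (needs Python 3.8+)
    s1 = pow(m, dp, p)           # half-size exponentiation modulo p
    s2 = pow(m, dq, q)           # half-size exponentiation modulo q
    h = (q_inv * (s1 - s2)) % p  # Garner recombination step
    return s2 + h * q            # signature s with s = m^d mod (p * q)


# Toy parameters, far too small for real use.
p, q, e = 61, 53, 17
n = p * q
d = pow(e, -1, (p - 1) * (q - 1))
signature = rsa_crt_sign(65, p, q, d)
```

Because s1 and s2 do not depend on each other, they can be computed in parallel, which is the property a multi-core or GPU implementation can exploit.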

ARITHMETIC BACKGROUND

CONSTANT-TIME MODULAR EXPONENTIATION
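The summary does not say which constant-time exponentiation algorithm the paper uses. As a generic illustration of the idea, the sketch below swaps in the well-known Montgomery ladder: it runs a fixed number of iterations and performs the same multiply and square in every iteration, selecting operands with a masked swap rather than a branch on the secret bit. The names `cswap` and `ladder_pow` are hypothetical, and Python integers are not constant-time, so only the regular control flow is demonstrated.

```python
def cswap(bit: int, a: int, b: int) -> tuple[int, int]:
    """Swap a and b when bit == 1, without branching on bit."""
    mask = -bit                  # 0 when bit == 0, all ones when bit == 1
    t = mask & (a ^ b)
    return a ^ t, b ^ t


def ladder_pow(base: int, exponent: int, modulus: int, bits: int) -> int:
    """Compute base**exponent mod modulus in exactly `bits` iterations."""
    r0, r1 = 1 % modulus, base % modulus   # invariant: r1 == r0 * base (mod modulus)
    for i in reversed(range(bits)):
        bit = (exponent >> i) & 1
        r0, r1 = cswap(bit, r0, r1)
        r1 = (r0 * r1) % modulus           # one multiplication per iteration
        r0 = (r0 * r0) % modulus           # one squaring per iteration
        r0, r1 = cswap(bit, r0, r1)
    return r0
```

For example, `ladder_pow(5, 117, 221, 8)` processes all 8 bit positions of the exponent regardless of how many of them are set.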

MONTGOMERY MODULAR ARITHMETIC
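As a reminder of what Montgomery reduction computes, here is a minimal textbook sketch on Python integers, assuming an odd modulus n and R = 2^k > n. The helper names (`montgomery_setup`, `redc`, `mod_mul`) are illustrative assumptions and do not reflect the multi-word, vectorized arithmetic of the paper's library.

```python
def montgomery_setup(n: int, k: int) -> tuple[int, int]:
    """Precompute n' = -n^{-1} mod R and R^2 mod n for odd n < R = 2**k."""
    r = 1 << k
    n_prime = (-pow(n, -1, r)) % r
    r2 = (r * r) % n
    return n_prime, r2


def redc(t: int, n: int, k: int, n_prime: int) -> int:
    """Return t * R^{-1} mod n for 0 <= t < n * R, using only shifts and masks by R."""
    mask = (1 << k) - 1
    m = ((t & mask) * n_prime) & mask   # m = (t mod R) * n' mod R
    u = (t + m * n) >> k                # exact division by R
    # A constant-time version would replace this branch with a masked subtraction.
    return u - n if u >= n else u


def mod_mul(a: int, b: int, n: int, k: int) -> int:
    """Compute a * b mod n by passing through Montgomery form."""
    n_prime, r2 = montgomery_setup(n, k)
    a_m = redc(a * r2, n, k, n_prime)    # a * R mod n
    b_m = redc(b * r2, n, k, n_prime)    # b * R mod n
    c_m = redc(a_m * b_m, n, k, n_prime) # a * b * R mod n
    return redc(c_m, n, k, n_prime)      # leave Montgomery form
```

The conversions into and out of Montgomery form cost extra reductions, so the technique pays off when many multiplications share one modulus, as in a modular exponentiation.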

RNS MODULAR ARITHMETIC

[Algorithm fragment: RNS product; addition of σ-bit terms; step 16 iterates over each processing unit i]
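To make the RNS idea concrete, the following sketch represents integers by their residues modulo a small set of pairwise coprime moduli and multiplies them channel by channel. The moduli, `MODULI`, and the function names are toy assumptions chosen for readability; the paper's actual RNS bases and its modular reduction algorithms are not reproduced here.

```python
from math import prod

# Hypothetical toy basis: pairwise coprime moduli (251, 11*23, 3*5*17, 2**8).
MODULI = (251, 253, 255, 256)


def to_rns(x: int) -> tuple[int, ...]:
    """Convert x to its residue vector."""
    return tuple(x % m for m in MODULI)


def rns_mul(a: tuple[int, ...], b: tuple[int, ...]) -> tuple[int, ...]:
    """Multiply channel by channel; each channel is independent of the others."""
    return tuple((x * y) % m for x, y, m in zip(a, b, MODULI))


def from_rns(residues: tuple[int, ...]) -> int:
    """Reconstruct the integer in [0, M) with the Chinese Remainder Theorem."""
    big_m = prod(MODULI)
    total = 0
    for r, m in zip(residues, MODULI):
        m_i = big_m // m
        total += r * m_i * pow(m_i, -1, m)
    return total % big_m


a, b = 12345, 54321
product = from_rns(rns_mul(to_rns(a), to_rns(b)))
```

The result is exact as long as the true product stays below the product of the moduli. Since no carries cross between channels, each residue can be assigned to its own processing unit, such as a GPU thread or an AVX2 lane.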

EFFICIENT IMPLEMENTATION OF RSA ON GPU PLATFORMS

CONCLUSION
