Abstract

The main contribution of this paper is to present hardware algorithms for the redundant radix-2^r number system on FPGAs that accelerate Montgomery modulo multiplication of numbers with many bits, an operation with applications in security systems such as RSA encryption and decryption. Quite surprisingly, our hardware algorithm for Montgomery modulo multiplication of two dr-bit numbers completes in only d+1 clock cycles. Since most FPGAs have 18-bit multipliers and 18K-bit block RAMs, it makes sense to let r=16. Our hardware algorithm for Montgomery modulo multiplication of 256-bit numbers runs in only 17 clock cycles using the redundant radix-64K (i.e., radix-2^16) number system. The experimental results for the Xilinx Virtex-II Pro Family FPGA XC2VP100-6 show that the clock frequency of our circuit is independent of d. Further, the hardware algorithm for 1024-bit Montgomery modulo multiplication using the redundant number system is 3 times faster than that using the conventional number system. Also, for 256-bit Montgomery modulo multiplication, our hardware algorithm runs in 0.322 μs, while a previously known implementation runs in 1.22 μs, even though our implementation uses fewer than half as many slices.
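To make the digit-serial structure concrete, the following is a minimal software sketch of textbook radix-2^r Montgomery multiplication, which consumes one r-bit digit of an operand per loop iteration. It is not the paper's hardware design: it uses ordinary Python integers rather than a redundant number representation, and the function name `montgomery_multiply` and its parameters are illustrative assumptions.

```python
def montgomery_multiply(a, b, m, r=16, d=16):
    """Return a * b * 2^(-r*d) mod m for an odd modulus m < 2^(r*d).

    Textbook digit-serial Montgomery multiplication (an assumption for
    illustration, not the paper's redundant-number circuit).
    Requires 0 <= a, b < m. Needs Python 3.8+ for pow(m, -1, base).
    """
    base = 1 << r
    mask = base - 1
    # -m^(-1) mod 2^r exists because m is odd.
    m_neg_inv = (-pow(m, -1, base)) % base

    s = 0
    for i in range(d):
        a_i = (a >> (r * i)) & mask      # i-th radix-2^r digit of a
        s += a_i * b                     # accumulate the partial product
        q = (s * m_neg_inv) & mask       # digit that clears the low r bits
        s = (s + q * m) >> r             # exact division by 2^r

    # The loop keeps s < 2m, so a single conditional subtraction suffices.
    if s >= m:
        s -= m
    return s


if __name__ == "__main__":
    m = (1 << 255) - 19                  # an odd 256-bit modulus (example only)
    R = 1 << 256                         # R = 2^(r*d) with r = 16, d = 16
    a, b = 123456789, 987654321
    # Convert to Montgomery form, multiply, and convert back.
    a_m = montgomery_multiply(a, (R * R) % m, m)
    b_m = montgomery_multiply(b, (R * R) % m, m)
    prod = montgomery_multiply(montgomery_multiply(a_m, b_m, m), 1, m)
    print(prod == (a * b) % m)
```

With r=16 and d=16 the loop handles a 256-bit operand in 16 digit steps. The sketch does not model clock cycles or carry handling, which is where the redundant number system in the paper comes in.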

