Abstract

This paper introduces a novel speed-oriented architecture for point multiplication in elliptic curve cryptography. A balanced full-precision multiplier is proposed to shorten latency, and a new modular inversion architecture is integrated to reduce the total number of clock cycles in point multiplication. A modified Montgomery ladder algorithm that processes one input bit in three clock cycles is proposed to make the best use of hardware resources. A mixed-pipeline technique is used to balance the delay of different paths and increase the clock frequency. The proposed architecture is implemented over $GF(2^{163})$ and $GF(2^{571})$ on Xilinx Virtex-5 and Virtex-7 FPGAs. For $GF(2^{163})$, the design reaches 211 MHz with 29309 LUTs and 547 clock cycles, or $2.6~\mu \text{s}$ latency, on Virtex-5; and 320.5 MHz with 28911 LUTs and $1.7~\mu \text{s}$ latency on Virtex-7. For $GF(2^{571})$, the design reaches 186 MHz with 286400 LUTs and 1813 clock cycles, or $9.6~\mu \text{s}$ latency, on Virtex-5; and 267 MHz with 290001 LUTs and $6.79~\mu \text{s}$ latency on Virtex-7. The proposed design achieves the lowest latency among existing works, and its overall performance is also among the best reported. Furthermore, it is demonstrated that the proposed architecture maintains a high speed for larger binary fields, making it well suited to large-bit-length platforms with a higher security level. Since the multiplier and its segments operate at different bit lengths and correspond to different fields, the proposed architecture can also be upgraded to a reconfigurable design that supports point multiplication over multiple fields in the future.
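
For readers unfamiliar with the Montgomery ladder named above, the following is a minimal software sketch of its control flow on a binary-field curve. It is not the proposed hardware architecture: it assumes a toy field GF(2^8) with the reduction polynomial `0x11B`, curve coefficients `A = B = 1`, affine coordinates, and Fermat-style inversion, all chosen only to keep the example short. Every function name (`gf_mul`, `gf_inv`, `ec_add`, `montgomery_ladder`, `find_point`) is hypothetical. The point it illustrates is that each scalar bit costs exactly one point addition and one point doubling, which is the regular per-bit work that the abstract says is scheduled in three clock cycles.

```python
# Toy parameters (assumptions for illustration, not the paper's fields).
M = 8          # field is GF(2^8)
POLY = 0x11B   # x^8 + x^4 + x^3 + x + 1, irreducible over GF(2)
A, B = 1, 1    # curve y^2 + xy = x^3 + A*x^2 + B, with B != 0


def gf_mul(a, b):
    """Multiply two field elements (shift-and-add with reduction by POLY)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if (a >> M) & 1:
            a ^= POLY
    return r


def gf_inv(a):
    """Invert a nonzero element as a^(2^M - 2) (Fermat's little theorem)."""
    if a == 0:
        raise ZeroDivisionError("zero has no inverse")
    result, base, e = 1, a, (1 << M) - 2
    while e:
        if e & 1:
            result = gf_mul(result, base)
        base = gf_mul(base, base)
        e >>= 1
    return result


def on_curve(P):
    """Check the curve equation; None stands for the point at infinity."""
    if P is None:
        return True
    x, y = P
    x2 = gf_mul(x, x)
    lhs = gf_mul(y, y) ^ gf_mul(x, y)
    rhs = gf_mul(x2, x) ^ gf_mul(A, x2) ^ B
    return lhs == rhs


def ec_double(P):
    """Affine point doubling on a non-supersingular binary curve."""
    if P is None or P[0] == 0:   # a point with x = 0 has order 2
        return None
    x1, y1 = P
    lam = x1 ^ gf_mul(y1, gf_inv(x1))
    x3 = gf_mul(lam, lam) ^ lam ^ A
    y3 = gf_mul(x1, x1) ^ gf_mul(lam, x3) ^ x3
    return (x3, y3)


def ec_add(P, Q):
    """Affine point addition, including the special cases."""
    if P is None:
        return Q
    if Q is None:
        return P
    x1, y1 = P
    x2, y2 = Q
    if x1 == x2:
        # Either Q == P (double) or Q == -P = (x1, x1 + y1) (infinity).
        return ec_double(P) if y1 == y2 else None
    lam = gf_mul(y1 ^ y2, gf_inv(x1 ^ x2))
    x3 = gf_mul(lam, lam) ^ lam ^ x1 ^ x2 ^ A
    y3 = gf_mul(lam, x1 ^ x3) ^ x3 ^ y1
    return (x3, y3)


def montgomery_ladder(k, P):
    """Compute k*P; every bit performs one addition and one doubling."""
    R0, R1 = None, P             # invariant: R1 - R0 == P
    for i in reversed(range(k.bit_length())):
        if (k >> i) & 1:
            R0, R1 = ec_add(R0, R1), ec_add(R1, R1)
        else:
            R0, R1 = ec_add(R0, R0), ec_add(R0, R1)
    return R0


def find_point():
    """Brute-force search for a curve point with x != 0 (toy field only)."""
    for x in range(1, 1 << M):
        for y in range(1 << M):
            if on_curve((x, y)):
                return (x, y)
    return None


if __name__ == "__main__":
    P = find_point()
    print(montgomery_ladder(0b101101, P))
```

Because both branches of the loop do the same amount of work, the ladder maps naturally onto a fixed hardware schedule. A hardware design would typically avoid the per-step inversion used in this affine sketch, for example by working in projective coordinates and inverting once at the end; whether and how the paper does so is not stated in this abstract.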
