Serialized lightweight SHA-3 FPGA implementations

Bernhard Jungk, Marc Stöttinger

doi:10.1016/j.micpro.2019.102857

Abstract

In this article, we extend our study of lightweight FPGA implementations of SHA-3 published at ReConFig 2016. We use the shallow pipeline optimization technique for the slice-oriented SHA-3 architecture developed previously and examine two additional aspects. First, we adapt the implementation to the state organization proposed by Winderickx et al., which is based on the shift register primitives available on Xilinx FPGA platforms. Second, we study the use of block RAM instead of distributed RAM for the original designs. The shallow pipeline optimization had already reduced the area to about 90 slices for both Virtex-5 and Virtex-6 FPGAs, a significant improvement over the previous state of the art. On the one hand, our additional results show that the optimized state representation by Winderickx et al. using shift registers does not improve the performance compared to the solution based on distributed RAM. The main reason is the implementation of the ρ function, which requires different rotation offsets for the individual lanes and, for most lanes, shift registers larger than the 64 bits of a lane. Together, these requirements lead to a higher than expected area consumption for the shift register approach, so that its total area is very similar to that of the RAM-based approach. On the other hand, the block RAM solution considerably reduces the slice utilization, from about 88 to only 54 slices, at the expense of 13 to 14 block RAMs. At the same time, however, the achievable maximum clock frequency is considerably lower because of the additional routing delays to and from the block RAM.
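To illustrate why the ρ function is awkward for a shift register based state, the following minimal Python sketch derives the ρ rotation offsets of Keccak-f[1600] from the standard SHA-3 definition and applies one of them as a software rotation. It is not taken from the hardware designs discussed in the article; the function names `rho_offsets` and `rotl` are hypothetical and chosen only for this example.

```python
# Sketch only: software model of the Keccak rho offsets, not the authors' FPGA design.

def rho_offsets(lane_bits=64):
    """Return offsets[x][y], the rho rotation offset of lane (x, y)."""
    offsets = [[0] * 5 for _ in range(5)]  # lane (0, 0) is never rotated
    x, y = 1, 0
    for t in range(24):
        # Triangular number (t + 1)(t + 2) / 2, reduced modulo the lane width.
        offsets[x][y] = ((t + 1) * (t + 2) // 2) % lane_bits
        x, y = y, (2 * x + 3 * y) % 5
    return offsets


def rotl(lane, offset, lane_bits=64):
    """Rotate a lane left by `offset` bits within `lane_bits` bits."""
    offset %= lane_bits
    mask = (1 << lane_bits) - 1
    return ((lane << offset) | (lane >> (lane_bits - offset))) & mask


if __name__ == "__main__":
    table = rho_offsets()
    for y in range(5):
        print([table[x][y] for x in range(5)])
```

For 64-bit lanes, every one of the 25 lanes receives its own offset, so a state built from shift registers cannot rotate all lanes by a single common amount.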

Full Text