Abstract

The fixslicing implementation strategy was originally introduced as a new representation for the hardware-oriented GIFT block cipher to achieve very efficient constant-time software implementations. In this article, we show that the fundamental idea underlying fixslicing is of interest not only for GIFT, but can be applied to other ciphers as well. In particular, we study the benefits of fixslicing for AES and show that it reduces by 52% the number of operations required by the linear layer compared to the current fastest bitsliced implementation on 32-bit platforms. Overall, we report that fixsliced AES-128 reaches 80 and 91 cycles per byte on ARM Cortex-M and E31 RISC-V processors respectively (assuming pre-computed round keys), improving the previous records on those platforms by 21% and 26%. To highlight that our work also directly improves masked implementations that rely on bitslicing, we report results with first-order masking that outperform by 12% the fastest results reported in the literature on ARM Cortex-M4. Finally, we demonstrate the genericity of the fixslicing technique for AES-like designs by applying it to the Skinny-128 tweakable block ciphers.

Highlights

  • Since the selection of the Rijndael block cipher as the Advanced Encryption Standard (AES) [DR02] in 2001, optimized implementations of this algorithm have attracted a lot of interest

  • We report that fixsliced AES-128 reaches 80 and 91 cycles per byte (cpb) on ARM Cortex-M and E31 RISC-V processors respectively, improving the previous records on those platforms by 21% and 26%

  • To obtain an implementation better suited to resource-constrained devices, we applied the concept of fixslicing to the AES and showed that omitting ShiftRows entirely reduces the number of operations spent in the linear layer over 4 rounds by 52%
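
The 4-round window in the last highlight has a simple byte-level intuition: ShiftRows rotates row r of the state by r positions, so four consecutive applications bring every byte back to where it started. The sketch below only illustrates that property on the standard byte-oriented state (column-major, as in FIPS 197); it is not the bitsliced representation used in the paper, and the function names are hypothetical.

```python
def shift_rows(state):
    """Apply AES ShiftRows to a 16-byte state stored column-major.

    The byte at index r + 4*c sits in row r, column c.
    Row r is rotated left by r positions: s'[r][c] = s[r][(c + r) % 4].
    """
    assert len(state) == 16
    out = [0] * 16
    for r in range(4):
        for c in range(4):
            out[r + 4 * c] = state[r + 4 * ((c + r) % 4)]
    return out


def shift_rows_power(state, n):
    """Apply ShiftRows n times in a row (hypothetical helper)."""
    for _ in range(n):
        state = shift_rows(state)
    return state


if __name__ == "__main__":
    state = list(range(16))
    # One application permutes the bytes.
    print(shift_rows(state))
    # Four applications restore the original layout.
    print(shift_rows_power(state, 4) == state)
```

Because the byte permutation has period 4, a representation that never applies ShiftRows explicitly can be expected to line up with the standard one again after every fourth round.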


Summary

Introduction

Since the selection of the Rijndael block cipher as the Advanced Encryption Standard (AES) [DR02] in 2001, optimized implementations of this algorithm have attracted a lot of interest. The fixslicing approach [ANP20] has been applied to the GIFT family of block ciphers [BPP+17], improving performance by a factor of 7 on ARM Cortex-M compared to naive bitslicing. This line of work highlights that the performance of a bitsliced implementation depends on the way the bits are packed within registers, and on possible alternative representations of the cipher. We report that fixsliced AES-128 reaches 80 and 91 cycles per byte (cpb) on ARM Cortex-M and E31 RISC-V processors respectively (assuming pre-computed round keys), improving the previous records on those platforms by 21% and 26%. These results require the ability to process two blocks simultaneously and apply to all parallelizable modes of operation (e.g. CTR, GCM). All our implementations are available in the public domain at https://github.com/aadomn/aes.
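
To make the remark about bit packing concrete, here is a minimal sketch of bitslicing, assuming a naive packing in which slice j collects bit j of up to 32 bytes. This is not the packing used in the paper or in the linked repository, and all function names are hypothetical. It shows multiplication by x in GF(2^8), the building block of MixColumns, computed on all packed bytes at once with a handful of XORs.

```python
def pack(data):
    """Bitslice up to 32 bytes: bit i of slices[j] is bit j of data[i]."""
    assert len(data) <= 32
    slices = [0] * 8
    for i, byte in enumerate(data):
        for j in range(8):
            slices[j] |= ((byte >> j) & 1) << i
    return slices


def unpack(slices, n):
    """Inverse of pack for the first n bytes."""
    return [sum(((slices[j] >> i) & 1) << j for j in range(8)) for i in range(n)]


def xtime_bitsliced(s):
    """Multiply every packed byte by x modulo x^8 + x^4 + x^3 + x + 1.

    Shifting left by one bit becomes a renaming of slices; the reduction
    XORs the old top slice into bit positions 0, 1, 3 and 4.
    """
    hi = s[7]
    return [hi, s[0] ^ hi, s[1], s[2] ^ hi, s[3] ^ hi, s[4], s[5], s[6]]


def xtime_bytewise(byte):
    """Reference implementation on a single byte."""
    byte <<= 1
    return byte ^ 0x11B if byte & 0x100 else byte


if __name__ == "__main__":
    data = list(range(0, 256, 8))  # 32 sample bytes
    fast = unpack(xtime_bitsliced(pack(data)), len(data))
    # The bitsliced result matches the byte-by-byte reference.
    print(fast == [xtime_bytewise(b) for b in data])
```

In this layout the arithmetic is cheap, while anything that moves bytes relative to one another has to be paid for with shifts and masks inside each slice, which is why the choice of packing and representation matters for performance.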

AES overview
Bitslicing the AES
A new ShiftRows-friendly representation
Fixslicing the AES
Application to the round function
Application to the key expansion
Implementation results
ARM Cortex-M
Interpretation and discussions
Taking first-order masking into consideration
Application to another AES-like design
Conclusion
Findings
A Fixslicing a single block for Skinny-128