Abstract

Masking is a well-loved and widely deployed countermeasure against side-channel attacks, in particular in software. Under certain assumptions (w.r.t. independence and noise level), masking provably prevents attacks up to a certain security order and leads to a predictable increase in the number of leakages required for successful attacks beyond this order. The noise level in typical processors where software masking is used may not be very high, so low masking orders are not sufficient for real-world security. Higher-order masking, however, comes at a great cost, and therefore a number of techniques have been published over the years that make such implementations more efficient via parallelisation in the form of bit or share slicing. We take two highly regarded schemes (ISW and Barthe et al.) and some corresponding open-source implementations that make use of share slicing, and discuss their true security on an ARM Cortex-M0 and an ARM Cortex-M3 processor (both from the LPC series). We show that micro-architectural features of the M0 and M3 undermine the independence assumptions made in masking proofs, and thus their theoretical guarantees do not translate into practice (even worse, it seems unpredictable at which order leaks can be expected). Our results demonstrate how difficult it is to link theoretical security proofs to practical real-world security guarantees.

Highlights

  • Nowadays, masking is arguably one of the most prevalent countermeasures against side channel analysis

  • We demonstrate that the “independent leakage” assumption does not hold on an ARM Cortex-M0 and an ARM Cortex-M3

  • Strictly speaking, it still depends on the specific leakage model: in practice, we found this approach provided a better signal-to-noise ratio (SNR) in most cases

Summary

Introduction

Nowadays, masking is arguably one of the most prevalent countermeasures against side-channel analysis. Although the authors did not directly link this scheme with any specific software or hardware implementation, its intrinsically parallel structure and good performance for higher-order masking [JS17] have made it a prevalent choice for a few bit-sliced masking implementations [JS17, GJRS18]. These implementations stipulate that all shares of a secret x should be stored within one register (denoted as “share-slicing” in this paper), which is a less-investigated option in previous studies of software masking [JS17]. As the XOR operation has a trivial shared form, we can decompose the Sbox into such gadgets and apply our side-channel protection in a divide-and-conquer manner. By default, this approach fits customized hardware better, as the proposed scheme is designed for a 1-bit computing unit. Note that even in this case, as field computation is hardly feasible for the same bit-width of modern CPUs, in practice these schemes might still come with some bit-slicing effort [GR17].
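
To make the share-slicing layout and the "trivial shared form" of XOR concrete, the following is a minimal sketch under stated assumptions; it is not taken from the implementations discussed in the paper. A Python integer stands in for a processor register, bit i of that integer holds share i of a single secret bit, and the function names (share_slice, unmask, shared_xor) are hypothetical. The sketch only checks functional correctness and says nothing about what a real device leaks.

```python
import secrets


def share_slice(bit, n_shares):
    """Encode one secret bit as n_shares Boolean shares packed into one word.

    Bit i of the returned integer is share i; the XOR of all shares is the secret.
    (Hypothetical helper for illustration only.)
    """
    # Draw n_shares - 1 uniformly random shares for the low bits.
    word = secrets.randbits(n_shares - 1) if n_shares > 1 else 0
    # Choose the last share so that the parity of the whole word equals the secret.
    parity = bin(word).count("1") & 1
    last_share = parity ^ bit
    return word | (last_share << (n_shares - 1))


def unmask(word):
    """Recombine the shares: the secret is the parity (XOR) of all bits."""
    return bin(word).count("1") & 1


def shared_xor(a_word, b_word):
    """XOR gadget: share i of the result is a_i XOR b_i, i.e. one word-wide XOR."""
    return a_word ^ b_word


if __name__ == "__main__":
    n = 4  # number of shares, assumed for illustration
    for x in (0, 1):
        for y in (0, 1):
            z = shared_xor(share_slice(x, n), share_slice(y, n))
            assert unmask(z) == x ^ y
```

The XOR gadget needs no fresh randomness and no interaction between different share positions, which is why it is described as trivial. Non-linear gadgets must combine different shares, and with this layout all shares of a value sit in the same register.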

Bit-slicing
Masking: the share allocation problem
Well-known threats
ARM Cortex-M family
Specific Example
Instruction-wise leakage analysis on bit-interaction
A masked AES with share-slicing
Attacks on a 2-share masked implementation
Attacks on a 4-share masked implementation
Advantages of Share slicing
Disadvantages of Share slicing
Conclusion
B Leakage analysis on other instructions
C ELMO evaluation