Abstract
Masking using randomised lookup tables is a popular countermeasure against side-channel attacks, particularly at small masking orders. An advantage of this class of countermeasures for masking S-boxes, compared to ISW-based masking, is that it supports pre-processing, which significantly reduces the amount of computation to be done after the unmasked inputs are available. Indeed, the “online” computation can be as fast as a single table lookup. However, the size of the randomised lookup table increases linearly with the masking order, and hence the RAM required to store pre-processed tables becomes infeasible for higher masking orders. Demonstrating the feasibility of full pre-processing of higher-order lookup table-based masking schemes on resource-constrained devices has therefore remained an open problem.
In this work, we solve the above problem by implementing a higher-order lookup table-based scheme using an amount of RAM that is essentially independent of the masking order. More concretely, we reduce the amount of RAM needed for the table-based scheme of Coron et al. (TCHES 2018) approximately by a factor equal to the number of shares. Our technique builds on the use of a pseudorandom generator (PRG) to minimise the randomness complexity of ISW-based masking schemes, as proposed by Ishai et al. (ICALP 2013) and Coron et al. (Eurocrypt 2020). Hence we show that, for lookup table-based masking schemes, the use of a PRG reduces not only the randomness complexity (now logarithmic in the size of the S-box) but also the memory complexity, without any significant increase in the overall running time. We have implemented in software the higher-order table-based masking scheme of Coron et al. (TCHES 2018) at tenth order, with full pre-processing of a single execution of all the AES S-boxes, on an ARM Cortex-M4 device that has 256 KB of RAM. Our technique requires only 41.2 KB of RAM, whereas the original scheme would have needed 440 KB. Moreover, our 8-bit implementation results demonstrate that the online execution time of our variant is about 1.5 times faster than that of the 8-bit bitsliced masked implementation of AES-128.
Highlights
An IoT ecosystem helps several heterogeneous devices to collect, send and act on the data acquired from their environments
We summarise the steps involved in the higher-order masked lookup table-based computation of an S-box using a robust pseudorandom generator (PRG) in Algorithm 4
Since the computation of index α depends on the number of pseudorandom values required per shift, we present the locality refresh (LR) variant for increasing shares using a strong (r, k, 1)-robust PRG in Algorithm 7
Summary
An IoT ecosystem helps several heterogeneous devices to collect, send and act on the data acquired from their environments. The randomness complexity is improved to O(k · k · n^3), compared to O(k · 2^k · n) for the original scheme. We achieve this reduction in RAM using the following observation: the masked lookup table of the higher-order lookup table-based scheme [CRZ18] requires k · 2^k · n bits of RAM for each of its two tables, one of which is the temporary table. To implement a 10th-order secure 128-bit AES using the randomised lookup table scheme, we require 2 · 10 · 2^8 = 5120 bytes of RAM per S-box to store random masks. In addition to the improved security proof, [CRZ18] discusses another refinement that reduces the randomness complexity to approximately half that of the original scheme. The idea behind this improvement is that, while shifting the table using input share x_i, the masked S-box values can be protected using only i−1 output masks, instead of n−1 masks as in the original scheme.
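The memory figures quoted above can be checked with a few lines of arithmetic. The sketch below is illustrative only: the helper name `table_bytes` and the constants are hypothetical, and it assumes that "all the AES S-boxes" means the 16 × 10 S-box lookups of the AES-128 rounds and that a tenth-order scheme uses n = 11 shares. Under those assumptions the original scheme's 440 KB and the roughly n-fold reduction both fall out of the k · 2^k · n formula.

```python
# Back-of-the-envelope RAM estimate for pre-processed masked S-box tables.
# All names here are hypothetical; only the k * 2^k * n formula and the
# quoted figures (5120 bytes, 440 KB, 41.2 KB) come from the text.

def table_bytes(k: int, shares: int, tables: int = 1) -> int:
    """Bytes needed for `tables` masked tables of a k-bit S-box,
    each storing `shares` k-bit values per table entry."""
    bits = k * (2 ** k) * shares * tables
    return bits // 8

K = 8                    # AES S-box width in bits
ORDER = 10               # masking order
N = ORDER + 1            # number of shares (assumption: n = order + 1)
AES128_SBOXES = 16 * 10  # assumption: round S-boxes only, 16 per round

# Random masks per S-box, as in the text: 2 * 10 * 2^8 bytes.
per_sbox_masks = table_bytes(K, ORDER, tables=2)

# Storing one n-share table per S-box for a full AES-128 execution.
full_original = AES128_SBOXES * table_bytes(K, N)

# Storing a single share per table entry instead of n shares.
single_share = AES128_SBOXES * table_bytes(K, 1)

print(per_sbox_masks)                # bytes per S-box for the masks
print(full_original / 1024)          # KiB for the original scheme
print(single_share / 1024)           # KiB with one share per entry
print(full_original / single_share)  # reduction factor
```

With these assumptions the original scheme comes to 450,560 bytes, which is exactly 440 KiB, and the single-share variant to 40 KiB, a reduction by a factor of 11. The reported 41.2 KB is slightly larger; this sketch does not model whatever accounts for the difference.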