Random number generation is critical to many applications. Gaming, gambling, and in particular cryptography all require random numbers that are uniform and unpredictable. To test whether supposedly random sources exhibit the characteristics commonly found in truly random sequences, batteries of statistical tests are used. These are fundamental tools in the evaluation of random number generators and form part of the pathway to certification of secure systems that implement them. Although there have been previous studies of this subject (Becker, 2013), RNG manufacturers and vendors continue to use statistical tests known to be of dubious reliability in their RNG verification processes. Our research shows that FIPS 140-2 cannot effectively identify adversarial biases, even very primitive ones. Concretely, this work illustrates the inability of the FIPS 140 family of tests to detect bias in three obviously flawed PRNGs. Although deprecated by official standards, these tests are still widely used, for example in hardware-level self-test schemes incorporated into the design of many True RNGs (TRNGs). They are also popular with engineers and cryptographers for quickly assessing the randomness characteristics of security primitives and protocols, and even with manufacturers aiming to market the randomness features of their products to potential customers. In the following, we present three biased-by-design RNGs to show in explicit detail how simple, glaringly obvious biases go undetected by all of the FIPS 140-2 tests. One of these RNGs is backdoored and leaks key material, while the others suffer from significantly reduced unpredictability in their output sequences. To make our point even more plainly, we show how files containing images can also fool the FIPS 140 family of tests.
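As a minimal sketch of the kind of failure described above (hypothetical; the paper's actual generators are not reproduced here), the following Python snippet builds a fully predictable, backdoored generator by hashing a fixed secret with a counter. Anyone holding the secret can reproduce the entire stream, yet the output is statistically balanced and passes a FIPS 140-2-style monobit check. The names `SECRET`, `backdoored_rng`, and `fips_monobit` are illustrative, not taken from the paper.

```python
import hashlib

SECRET = b"attacker-known key"  # hypothetical backdoor: whoever knows this predicts everything

def backdoored_rng(n_bytes: int) -> bytes:
    """Deterministic stream: SHA-256(SECRET || counter). Predictable, yet
    statistically uniform-looking."""
    out = bytearray()
    ctr = 0
    while len(out) < n_bytes:
        out += hashlib.sha256(SECRET + ctr.to_bytes(8, "big")).digest()
        ctr += 1
    return bytes(out[:n_bytes])

def fips_monobit(stream: bytes) -> bool:
    """FIPS 140-2 monobit test on a 20,000-bit (2,500-byte) sample:
    the number of ones must lie strictly between 9,725 and 10,275."""
    assert len(stream) == 2500
    ones = sum(bin(b).count("1") for b in stream)
    return 9725 < ones < 10275

stream = backdoored_rng(2500)        # 20,000 bits
print(fips_monobit(stream))          # overwhelmingly likely True: hash output is balanced
```

The point mirrors the abstract's claim: the monobit test measures only bit balance, so a stream that is trivially predictable to an insider still sails through it.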
We end with a discussion of the security issues affecting an interesting and active project to create a randomness beacon. Its authors tested the quality of their randomness only with the FIPS 140 family of tests, and we show how this led them to produce predictable output that, although it passes FIPS, fails other randomness tests quite catastrophically.