Pseudo-random streams for distributed and parallel stochastic simulations on GP-GPU

J Passerat-Palmbach, C Mazel, D R C Hill

doi:10.1057/jos.2012.8

Abstract

Random number generation is a key element of stochastic simulations. It has been widely studied for sequential applications, so pseudo-random numbers can be used reliably in that setting. Unfortunately, the same confidence is not warranted for parallel stochastic simulations. Many applications still neglect random stream parallelization, leading to potentially biased results. In particular, parallel execution platforms such as Graphics Processing Units (GPUs) add their own constraints to those of Pseudo-Random Number Generators (PRNGs) used in parallel. As a result, potential biases can be combined with performance drops when random streams have not been parallelized rigorously. Here, we propose criteria guiding the design of good GPU-enabled PRNGs. We complement these criteria with a study of the techniques that aim to parallelize random streams correctly in the context of GPU-enabled stochastic simulations.

Highlights

Random number generation is a key element of stochastic simulations
Although high-quality Pseudo-Random Number Generators (PRNGs) have existed for more than a decade, some recent publications dealing with Graphics Processing Unit (GPU) implementations of PRNGs still propose old and weak generators
Proposing PRNGs for GPUs implies taking into consideration two different domains: General-Purpose computation on Graphics Processing Units (GP-GPU) programming and PRNG parallelization techniques

Summary

Pseudorandom Number Generation on GPU

This section surveys the main proposals in the literature for PRNG implementations on GPU platforms. CURAND (NVIDIA, 2010a), introduced in what was the latest version of the CUDA framework at the time of writing, is designed to generate random numbers in a straightforward way on CUDA-enabled GPUs. Its main advantage is that it can produce both quasi-random and pseudorandom sequences, on either the GPU or the CPU. ShoveRand offers users a common API, whichever PRNG they select, and guides developers who would like to integrate a new generator into the framework. The framework performs compile-time analysis on the provided source code to ensure that only PRNG implementations whose public interface matches our guidelines will compile successfully. GPU-enabled algorithms need to repeat the same operation on different data to exploit the device correctly; this is the main reason for the recent proposals of dedicated PRNGs. Implementing PRNGs so that numbers are drawn directly on the GPU led us to think about the best design for such software, considering both PRNG characteristics and GPU constraints.
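The idea of a common API that rejects non-conforming generators at compile time can be illustrated with a minimal host-side C++17 sketch. It is not ShoveRand's actual code or interface: the trait `has_prng_interface`, the wrapper `RandomStream`, and the required members `seed` and `next` are hypothetical names chosen for illustration, and the 32-bit xorshift generator is only a short placeholder, not a recommendation for simulation work.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>
#include <utility>

// Hypothetical interface rule (an assumption, not ShoveRand's guidelines):
// a generator must provide `seed(std::uint64_t)` and `std::uint32_t next()`.
template <typename G, typename = void>
struct has_prng_interface : std::false_type {};

template <typename G>
struct has_prng_interface<
    G,
    std::void_t<decltype(std::declval<G&>().seed(std::uint64_t{})),
                decltype(std::declval<G&>().next())>>
    : std::is_same<decltype(std::declval<G&>().next()), std::uint32_t> {};

// Placeholder generator: Marsaglia's 32-bit xorshift with shifts (13, 17, 5).
// Chosen for brevity only; it is not a high-quality PRNG.
struct XorShift32 {
    std::uint32_t state = 2463534242u;

    void seed(std::uint64_t s) {
        state = static_cast<std::uint32_t>(s ^ (s >> 32));
        if (state == 0) state = 2463534242u;  // an all-zero state never changes
    }

    std::uint32_t next() {
        assert(state != 0 && "xorshift state must be non-zero");
        state ^= state << 13;
        state ^= state >> 17;
        state ^= state << 5;
        return state;
    }
};

// A type that does not satisfy the interface rule.
struct NotAGenerator {
    int value = 0;
};

// Common front end: instantiating it with a non-conforming type stops
// compilation with a readable message instead of failing at run time.
template <typename G>
class RandomStream {
    static_assert(has_prng_interface<G>::value,
                  "G must provide seed(std::uint64_t) and std::uint32_t next()");
    G gen_;

public:
    explicit RandomStream(std::uint64_t seed) { gen_.seed(seed); }

    // Uniform double in [0, 1), built from one 32-bit draw.
    double uniform() { return gen_.next() * (1.0 / 4294967296.0); }
};
```

With these definitions, `RandomStream<XorShift32> stream(42);` compiles and `stream.uniform()` returns a value in [0, 1), whereas `RandomStream<NotAGenerator>` is rejected by the `static_assert`. User code written against `RandomStream` stays unchanged when the underlying generator is swapped.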

GP-GPU specific criteria for PRNGs design

Location of PRNGs’ internal data structures on GP-GPU

GP-GPU specific requirements for random streams parallelization

Random streams parallelization techniques fitting GPGPUs

Findings

Conclusion