Abstract

The library PRAND for pseudorandom number generation on modern CPUs and GPUs is presented. It contains both single-threaded and multi-threaded realizations of a number of modern and highly reliable generators recently proposed and studied in Barash (2011), Matsumoto and Nishimura (1998), L’Ecuyer (1999, 1999), Barash and Shchur (2006), together with the efficient SIMD realizations proposed in Barash and Shchur (2011). A feature that makes PRAND useful in parallel simulations is the ability to initialize up to 10^19 independent streams. Using the massive parallelism of modern GPUs and the SIMD parallelism of modern CPUs substantially improves the performance of the generators.

Program summary

Program title: PRAND

Catalogue identifier: AESB_v1_0

Program summary URL: http://cpc.cs.qub.ac.uk/summaries/AESB_v1_0.html

Program obtainable from: CPC Program Library, Queen’s University, Belfast, N. Ireland

Licensing provisions: Standard CPC licence, http://cpc.cs.qub.ac.uk/licence/licence.html

No. of lines in distributed program, including test data, etc.: 45979

No. of bytes in distributed program, including test data, etc.: 23953564

Distribution format: tar.gz

Programming language: CUDA C, Fortran.

Computer: PC, workstation, laptop, or server with NVIDIA GPU (tested on Tesla X2070, Fermi C2050, GeForce GT540M) and with Intel or AMD processor.

Operating system: Linux with CUDA version 5.0 or later. Should also run on MacOS, Windows, or UNIX.

RAM: 4 Mbytes

Classification: 4.13.

Nature of problem: Any calculation requiring a uniform pseudorandom number generator, in particular Monte Carlo calculations. Any calculation or simulation requiring uncorrelated parallel streams of uniform pseudorandom numbers.

Solution method: The library contains realizations of a number of modern and reliable generators: MT19937, MRG32K3A and LFSR113. It also includes new realizations of the method based on parallel evolution of an ensemble of transformations of the two-dimensional torus: GM19, GM29, GM31, GM61, GM55, GQ58.1, GQ58.3 and GQ58.4. The library contains single-threaded and multi-threaded realizations for GPU, single-threaded realizations for CPU, and realizations for CPU based on the SSE instruction set. For each of the RNGs, the library can also jump ahead within the RNG sequence and initialize independent random number streams using the block splitting method.

Restrictions: NVIDIA CUDA Toolkit version 5.0 or later should be installed. To use the GPU realizations, an NVIDIA GPU supporting CUDA and the corresponding NVIDIA driver should be installed. The SSE realizations of the generators require an Intel or AMD CPU supporting the SSE2 instruction set. The SSE realization of LFSR113 requires a CPU supporting the SSE4 instruction set.

Additional comments: A version of this program that contains only the realizations for CPUs is held in the Library as Catalogue Id. AEIT_v2_0 (RNGSSELIB). It does not require a GPU device or CUDA compiler.

Running time: The tests and the examples included in the package take about one minute or less to run. Running time is analyzed in detail in Section 8 of the paper.
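The jump-ahead and block splitting features named in the solution method can be illustrated with a minimal sketch in plain C. It does not use the PRAND API and does not implement any of the PRAND generators: the generator is a toy 64-bit linear congruential generator chosen only because its jump-ahead fits in a few lines, and the names lcg_state, lcg_next, lcg_skip_ahead and lcg_init_stream are hypothetical. The idea it shows is the general one: stream k starts k*block_len steps into a single long sequence, and the starting state is reached in O(log n) operations rather than by stepping.

```c
#include <stdint.h>

/* Toy generator, NOT part of PRAND: x_{n+1} = A*x_n + C (mod 2^64).
   Multiplier and increment are Knuth's MMIX constants. */
#define LCG_A 6364136223846793005ULL
#define LCG_C 1442695040888963407ULL

typedef struct { uint64_t x; } lcg_state;   /* hypothetical state type */

/* Advance the state by one step and return the new value. */
static uint64_t lcg_next(lcg_state *s) {
    s->x = LCG_A * s->x + LCG_C;
    return s->x;
}

/* Jump ahead n steps in O(log n) operations.
   One step is the affine map x -> a*x + c; n steps are obtained by
   repeated squaring of that map (unsigned overflow gives mod 2^64). */
static void lcg_skip_ahead(lcg_state *s, uint64_t n) {
    uint64_t a = LCG_A, c = LCG_C;    /* map for 2^k steps */
    uint64_t acc_a = 1, acc_c = 0;    /* accumulated map, initially identity */
    while (n > 0) {
        if (n & 1) {                  /* compose accumulated map with current */
            acc_a = a * acc_a;
            acc_c = a * acc_c + c;
        }
        c = (a + 1) * c;              /* square the current map: c first, */
        a = a * a;                    /* then a, since c needs the old a  */
        n >>= 1;
    }
    s->x = acc_a * s->x + acc_c;
}

/* Block splitting: stream number `stream` starts stream*block_len steps
   after the seed, so streams do not overlap as long as each consumes
   at most block_len numbers. */
static lcg_state lcg_init_stream(uint64_t seed, uint64_t stream,
                                 uint64_t block_len) {
    lcg_state s = { seed };
    lcg_skip_ahead(&s, stream * block_len);
    return s;
}
```

In a parallel simulation each thread or process would call lcg_init_stream with its own stream index and then draw numbers with lcg_next. The same pattern carries over to generators with much longer periods, where the number of available non-overlapping blocks is correspondingly larger.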
