Abstract

This paper presents a new high-performance implementation of Marsa-LFIB4, a high-quality multiple recursive pseudorandom number generator. We propose a new algorithmic approach that combines language-based vectorization techniques with a new divide-and-conquer method that exploits the special sparse structure of the matrix obtained from the recursive formula defining the generator. We also show how intrinsics for the Intel AVX2 and AVX512 vector extensions can improve performance. Our new implementation achieves good performance on several multicore architectures and is much more energy-efficient than simple SIMD-optimized implementations.
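
For orientation, the recurrence behind the generator can be sketched in a few lines of scalar Python. This is a minimal illustration only, not the vectorized or divide-and-conquer implementation the paper describes. It assumes the lags 55, 119, 179 and 256 with arithmetic modulo 2^32, as in Marsaglia's published description of LFIB4. The names `lfib4_stream` and `toy_seed` are hypothetical, and the seeding routine is a placeholder chosen for reproducibility, not a recommended initialization.

```python
MASK32 = 0xFFFFFFFF            # reduce sums modulo 2^32
LAGS = (55, 119, 179, 256)     # assumed lags of the LFIB4 recurrence


def toy_seed(s=12345):
    """Fill 256 seed words with a simple LCG (illustrative placeholder only)."""
    words = []
    for _ in range(256):
        s = (69069 * s + 1) & MASK32
        words.append(s)
    return words


def lfib4_stream(seed_words, count):
    """Return `count` values of
    x_n = (x_{n-55} + x_{n-119} + x_{n-179} + x_{n-256}) mod 2^32,
    where seed_words supplies x_0 .. x_255.
    """
    if len(seed_words) != 256:
        raise ValueError("need exactly 256 seed words")
    state = [w & MASK32 for w in seed_words]   # state[i] holds x_i
    out = []
    for n in range(256, 256 + count):
        x = sum(state[n - lag] for lag in LAGS) & MASK32
        state.append(x)
        out.append(x)
    return out


if __name__ == "__main__":
    for value in lfib4_stream(toy_seed(), 5):
        print(value)
```

Because the shortest lag in this sketch is 55, each new value depends only on values at least 55 positions back, so short runs of consecutive outputs can be computed independently of one another. This is the kind of structure that SIMD-oriented implementations can exploit.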
