Abstract

Direct Simulation Monte Carlo (DSMC) is increasingly applied to near-continuum gas flows, which makes its high computational cost even more prominent. Meanwhile, heterogeneous architectures are becoming mainstream in modern supercomputers, so accelerating DSMC on heterogeneous systems has become a necessary consideration for DSMC researchers. However, little research is available on DSMC programs that support different heterogeneous systems. In this paper, we develop a DSMC program that supports heterogeneous systems through OpenMP 4.5; in theory, our code can execute on many accelerators, such as Nvidia GPUs, AMD GPUs, and Intel Many Integrated Core (MIC) products. We first elaborate the implementation of our heterogeneous DSMC and then explain the algorithm proposed to generate random numbers on accelerators. The optimization methods, including the load-balance and granularity refinement strategy, branch elimination, and the removal and entry of particles, are also described in detail. A shock-bubble interaction (SBI) case at a Knudsen number of [Formula: see text] is used to validate the accuracy of our DSMC program and to analyze its performance. With an Nvidia RTX 2080 serving as the device, the results show that our program not only yields accurate flowfield information but also performs well under different computing loads. More specifically, an overall speedup ratio of [Formula: see text] is obtained relative to a single CPU core, and for the Collision subroutine the ratio reaches approximately 90.
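
To make the idea of generating random numbers inside an offloaded loop concrete, the following is a minimal sketch, not the algorithm proposed in the paper, which the abstract does not specify. It assumes a stateless, counter-based scheme in which each particle hashes a seed, its own index, and the time step, so that no generator state is shared between device threads. The names mix32, uniform01, and fill_uniform are hypothetical, and the hash constants are a generic integer mixer chosen only for illustration. The pragmas use OpenMP 4.5 target constructs; when compiled without OpenMP or without an available device, the loop simply runs on the host.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Functions called inside a target region must be declared for the device. */
#pragma omp declare target

/* Hypothetical 32-bit integer mixer (illustrative, not from the paper). */
static uint32_t mix32(uint32_t x)
{
    x ^= x >> 16;
    x *= 0x7feb352dU;
    x ^= x >> 15;
    x *= 0x846ca68bU;
    x ^= x >> 16;
    return x;
}

/* Map (seed, particle id, step) to a uniform value in [0, 1).
   No state is kept, so threads never contend for a shared generator. */
static double uniform01(uint32_t seed, uint32_t id, uint32_t step)
{
    uint32_t h = mix32(seed ^ mix32(id * 0x9e3779b9U + step));
    return (double)(h >> 8) * (1.0 / 16777216.0); /* top 24 bits / 2^24 */
}

#pragma omp end declare target

/* Fill out[0..n-1] with one random number per particle on the device. */
void fill_uniform(double *out, int n, uint32_t seed, uint32_t step)
{
    assert(out != NULL && n >= 0); /* host-side precondition check */

    #pragma omp target teams distribute parallel for map(from: out[0:n])
    for (int i = 0; i < n; ++i) {
        out[i] = uniform01(seed, (uint32_t)i, step);
    }
}
```

Calling fill_uniform(r, n, seed, step) with the same arguments reproduces the same values, which can help when comparing host and device runs. A production DSMC code would need a generator with verified statistical quality; this mixer is only a placeholder.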
