Abstract
High performance and efficiency in parallel computing are essential for large-scale discrete element method (DEM) simulation. After analyzing a DEM simulation framework built on a graphics processing unit (GPU) platform with the CUDA architecture and evaluating the simulation data, we propose three optimization methods to improve the performance of the system. First, a stencil computation model based on gridding is applied to neighboring-particle searching and force calculation, giving a regular structure to particle-particle contact detection. Second, an effective parallel granularity is determined by varying the number of blocks and threads on the GPU. Third, shared memory is used for data prefetching and for storing intermediate results, with its allocation guided by analysis and calculation. The experimental results show that the stencil model is useful for particle searching and force calculation, and that a suitable parallel granularity together with appropriate use of shared memory improves the performance of the DEM simulation framework.
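To make the grid-based stencil idea concrete, the following is a minimal serial C++ sketch of neighbor searching over a uniform grid, where each particle is tested only against particles in its own cell and the 26 surrounding cells. It is an illustration under stated assumptions, not the framework's GPU implementation: the names `Particle`, `Grid`, and `findContacts` are hypothetical, the domain is assumed to be a cube starting at the origin, and the cell size is assumed to be at least the largest particle diameter so that every contacting pair lies within the 3x3x3 stencil.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical particle type: centre position and radius.
struct Particle { double x, y, z, radius; };

// Hypothetical uniform grid over the cube [0, boxLength)^3.
struct Grid {
    double cell;                          // cell edge length
    int n;                                // number of cells per axis
    std::vector<std::vector<int>> cells;  // particle indices stored per cell

    Grid(double boxLength, double cellSize)
        : cell(cellSize),
          n(std::max(1, static_cast<int>(std::ceil(boxLength / cellSize)))),
          cells(static_cast<std::size_t>(n) * n * n) {
        assert(cellSize > 0.0);
    }

    // Map a coordinate to a cell index, clamped to the grid bounds.
    int index(double v) const {
        int i = static_cast<int>(std::floor(v / cell));
        return std::min(std::max(i, 0), n - 1);
    }

    // Flatten a 3D cell index into the 1D storage index.
    std::size_t flat(int ix, int iy, int iz) const {
        return (static_cast<std::size_t>(iz) * n + iy) * n + ix;
    }
};

// Return each overlapping pair (i, j) with i < j exactly once.
// Assumption: cellSize >= largest particle diameter.
std::vector<std::pair<int, int>> findContacts(const std::vector<Particle>& p,
                                              double boxLength, double cellSize) {
    Grid g(boxLength, cellSize);
    const int count = static_cast<int>(p.size());

    // Binning pass: assign every particle to its cell.
    for (int i = 0; i < count; ++i)
        g.cells[g.flat(g.index(p[i].x), g.index(p[i].y), g.index(p[i].z))].push_back(i);

    std::vector<std::pair<int, int>> contacts;
    for (int i = 0; i < count; ++i) {
        const int cx = g.index(p[i].x), cy = g.index(p[i].y), cz = g.index(p[i].z);
        // Stencil pass: visit the 3x3x3 block of cells centred on particle i.
        for (int dz = -1; dz <= 1; ++dz)
        for (int dy = -1; dy <= 1; ++dy)
        for (int dx = -1; dx <= 1; ++dx) {
            const int ix = cx + dx, iy = cy + dy, iz = cz + dz;
            if (ix < 0 || iy < 0 || iz < 0 || ix >= g.n || iy >= g.n || iz >= g.n)
                continue;  // stencil cell lies outside the domain
            for (int j : g.cells[g.flat(ix, iy, iz)]) {
                if (j <= i) continue;  // skip self and already-reported pairs
                const double rx = p[i].x - p[j].x;
                const double ry = p[i].y - p[j].y;
                const double rz = p[i].z - p[j].z;
                const double rsum = p[i].radius + p[j].radius;
                if (rx * rx + ry * ry + rz * rz < rsum * rsum)
                    contacts.emplace_back(i, j);  // spheres overlap
            }
        }
    }
    return contacts;
}
```

In a GPU version, the outer loop over particles would typically be distributed across threads, and the fixed 27-cell access pattern is what makes the search a stencil computation whose neighboring cell data could be staged in shared memory. How the original framework maps this onto blocks and threads is not specified here.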