Performance improvements of differential operators code for MPS method on GPU

Kohei Murotani, Issei Masaie, Ryuji Shioya, Seiichi Koshizuka, Takuya Matsunaga, Masao Ogino, Toshimitsu Fujisawa

doi:10.1007/s40571-015-0059-2

Abstract

In the present study, we improve the performance of the particle search and particle interaction calculation steps, which constitute the performance bottleneck of the moving particle simulation (MPS) method, by developing GPU-compatible algorithms for many-core processor architectures. For the particle search, the bucket loops of the cell-linked list are restructured to use fewer local variables, and two algorithms for searching particles within a bucket, the linked list and the forward star, are compared. For the particle interaction calculation, we address the problem that the ratio of particles actually within the interaction domain to the neighboring particle candidates is quite low. With these improvements, a performance efficiency of 24.7 % is achieved for the first-order polynomial approximation scheme using an NVIDIA Tesla K20, CUDA 6.5, and double-precision floating-point operations.
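The two within-bucket storage schemes named in the abstract can be illustrated with a small CPU-side sketch. This is not the authors' GPU code: the function names, the toy `bucket_of` array, and the use of Python are all assumptions made for illustration. In the linked list, each bucket stores the index of one particle and each particle points to the next one in the same bucket. In the forward star, particle indices are grouped by bucket in one contiguous array, with an offset array marking where each bucket begins.

```python
from itertools import accumulate


def build_linked_list(bucket_of, n_buckets):
    """Linked list: head[b] is one particle in bucket b, nxt[i] the next one."""
    head = [-1] * n_buckets          # -1 marks an empty bucket / end of chain
    nxt = [-1] * len(bucket_of)
    for i, b in enumerate(bucket_of):
        nxt[i] = head[b]             # push particle i onto the front of bucket b
        head[b] = i
    return head, nxt


def build_forward_star(bucket_of, n_buckets):
    """Forward star: items[start[b]:start[b + 1]] are the particles in bucket b."""
    counts = [0] * n_buckets
    for b in bucket_of:
        counts[b] += 1
    start = [0] + list(accumulate(counts))   # prefix sum of bucket sizes
    cursor = start[:-1]                      # next free slot per bucket (a copy)
    items = [0] * len(bucket_of)
    for i, b in enumerate(bucket_of):
        items[cursor[b]] = i
        cursor[b] += 1
    return start, items


def members_linked_list(head, nxt, b):
    """Walk the chain of bucket b; each step is a dependent memory access."""
    out = []
    i = head[b]
    while i != -1:
        out.append(i)
        i = nxt[i]
    return out


def members_forward_star(start, items, b):
    """Read a contiguous slice; no pointer chasing is needed."""
    return items[start[b]:start[b + 1]]


# Hypothetical toy data: bucket index of each of six particles.
bucket_of = [2, 0, 2, 1, 0, 2]
head, nxt = build_linked_list(bucket_of, 3)
start, items = build_forward_star(bucket_of, 3)
```

Both layouts return the same set of particles for a bucket. They differ in access pattern: the linked list follows a chain of indices, whereas the forward star reads a contiguous range, which is the kind of trade-off such a comparison would examine.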

Full Text