Abstract
Numerical simulation of microstructure is computationally intensive: simulation times are long and the achievable simulation scale is small. To address both problems, this article uses MPI+CUDA hybrid-granularity heterogeneous parallel computing to simulate dendrite growth with a three-dimensional PF-LBM phase-field model. The Message Passing Interface (MPI) provides the coarse-grained division of the domain, which removes the limit that a single machine places on the simulation scale. Within each node, the Compute Unified Device Architecture (CUDA) provides the fine-grained division, so that the work inside a node is fully parallel and overall computational efficiency improves. The article also proposes a "pseudo three-dimensional array" programming method to simplify array handling in CUDA, and an improved CUDA random number generation method to reduce random number generation time. Experiments show that, at the same simulation scale, the speedup of MPI+CUDA on 21 nodes was 57, which is 54% higher than that of MPI alone on 21 nodes. At comparable computing efficiency, the largest simulation scale reached with MPI+CUDA on 21 nodes was 420³, which is 13 times that of a single GPU. The MPI+CUDA hybrid-granularity parallel method proposed in this paper therefore combines the high computational efficiency of the GPU with the ability of MPI to expand the simulation scale.
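The abstract does not spell out how the "pseudo three-dimensional array" is built. One common reading, assumed here, is a single flat buffer addressed through three indices, which is also the layout a CUDA kernel would receive as a plain pointer. The sketch below shows that idea on the host side in C++; the name `Field3D` and its members are hypothetical and do not come from the paper.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the paper): a flat buffer that is
// addressed as if it were a three-dimensional array.
struct Field3D {
    std::size_t nx, ny, nz;
    std::vector<double> data;

    Field3D(std::size_t nx_, std::size_t ny_, std::size_t nz_)
        : nx(nx_), ny(ny_), nz(nz_), data(nx_ * ny_ * nz_, 0.0) {}

    // Map (i, j, k) to a flat offset; i varies fastest, k slowest.
    std::size_t index(std::size_t i, std::size_t j, std::size_t k) const {
        assert(i < nx && j < ny && k < nz);
        return (k * ny + j) * nx + i;
    }

    // Element access with three indices on top of the flat storage.
    double& at(std::size_t i, std::size_t j, std::size_t k) {
        return data[index(i, j, k)];
    }
};
```

With this layout, a 168 × 168 × 168 field is one contiguous allocation that can be copied to the device in a single transfer, and the same index expression can be reused inside a kernel.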
Highlights
The phase-field method is one of the most promising methods for the numerical simulation of dendrite growth in microstructures.[1]
A three-dimensional dendrite growth model with a coupled flow field (PF-LBM, where LBM is the lattice Boltzmann method) is simulated at the same scale (168 × 168 × 168) using a single CPU, a single GPU, the Message Passing Interface (MPI), and MPI + Compute Unified Device Architecture (CUDA). The simulation conditions and parameters are given in Table I and Table II; the flow velocity is 0.007 m/s and the initial temperature is 900 K.
With MPI + CUDA mixed-granularity parallelism, the dendrites grow faster and larger in the direction of the flow field, and the simulation is found to be in good agreement with reality.
Summary
Yamanaka et al.[7] used a single NVIDIA Tesla C1060 GPU to accelerate a phase-field simulation of dendritic solidification in a binary alloy. At a computation scale of 576³, they achieved a speedup of 100 times. The above researchers used MPI or CUDA only to simulate the microstructure evolution of binary alloys without flow. They did not parallelize the phase-field model coupled with a flow field, and they did not attempt to combine MPI with the GPU to expand the simulation scale while keeping computational efficiency high. We use an MPI+CUDA hybrid-granularity parallel method to simulate the three-dimensional PF-LBM dendrite growth model. The simulation results, efficiency and scale are compared with those of the MPI-only and CUDA-only parallel methods to establish its advantages.
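The coarse-grained MPI division can be pictured as splitting the grid among ranks, with each rank handing its part to a GPU. The text does not state how the domain is partitioned, so the sketch below assumes a one-dimensional slab decomposition along z; `Slab` and `slab_for_rank` are hypothetical names, and halo layers and the MPI calls themselves are left out.

```cpp
#include <cassert>

// Half-open range [z_begin, z_end) of z-layers owned by one rank.
struct Slab {
    int z_begin;
    int z_end;
};

// Assumed 1D slab decomposition (not confirmed by the paper):
// split nz layers among `ranks` processes as evenly as possible,
// giving the first (nz % ranks) ranks one extra layer each.
Slab slab_for_rank(int nz, int ranks, int rank) {
    assert(ranks > 0 && rank >= 0 && rank < ranks);
    int base = nz / ranks;   // layers every rank gets
    int extra = nz % ranks;  // leftover layers to distribute
    int begin = rank * base + (rank < extra ? rank : extra);
    int size = base + (rank < extra ? 1 : 0);
    return {begin, begin + size};
}
```

Under this assumption, a 168-layer grid on 21 ranks gives each rank 8 layers, and a 420-layer grid gives each rank 20 layers, so the per-GPU workload stays small even as the global scale grows.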