The cdugksFOAM program parallelizes over the physical-space grid and the velocity-space grid simultaneously, which gives it potential for large-scale parallelism. However, its running time depends strongly on the number of physical-space and velocity-space partitions. To find the optimal partitioning strategy for a specific CFD problem on a given parallel computer, we performed performance modeling of cdugksFOAM. First, we proposed a floating-point operations model, an MPI communication volume model, and a memory consumption model. Based on these models, we established a Roofline model to predict the computational time and a separate model to predict the communication time. Combining the computational time model and the communication time model, we proposed an execution time model and verified its effectiveness with two cases. Finally, we identified the optimal running strategy, the one that minimizes the product of the number of computing nodes and the execution time, providing meaningful guidance for the economical execution of the program.
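The selection criterion described above can be illustrated with a minimal sketch. It is not the paper's model. It assumes the textbook Roofline bound (time limited by either peak floating-point rate or memory bandwidth), a simple bandwidth-only communication term, and an execution time equal to the sum of the two. The `Machine` fields, function names, and all numbers are hypothetical placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Machine:
    """Hypothetical per-node hardware parameters (not taken from the paper)."""
    peak_flops: float     # peak floating-point rate, FLOP/s
    mem_bandwidth: float  # memory bandwidth, bytes/s
    net_bandwidth: float  # network bandwidth, bytes/s


def roofline_time(flops, mem_bytes, m):
    """Textbook Roofline bound: the slower of compute and memory traffic."""
    return max(flops / m.peak_flops, mem_bytes / m.mem_bandwidth)


def execution_time(flops, mem_bytes, comm_bytes, m):
    """Assumed form: computation time plus bandwidth-limited communication time."""
    return roofline_time(flops, mem_bytes, m) + comm_bytes / m.net_bandwidth


def best_strategy(candidates, m):
    """Pick the candidate minimizing nodes * execution time.

    Each candidate is (nodes, flops_per_node, mem_bytes_per_node,
    comm_bytes_per_node); in practice these would come from the
    operation-count, memory, and communication-volume models.
    """
    return min(
        candidates,
        key=lambda c: c[0] * execution_time(c[1], c[2], c[3], m),
    )
```

Given a list of candidate partitionings, `best_strategy` returns the one with the lowest node-time product, which may differ from the one with the shortest wall-clock time.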