This article presents an enhanced parallelization of the WSM7 microphysics scheme for the Weather Research and Forecasting (WRF) model. The parallelization is designed to maximize the utilization of a heterogeneous computing system consisting of CPUs, GPUs, or both. To this end, the reference implementation of the WSM7 scheme is re-implemented for a heterogeneous execution model. For each time step, a dynamic load distribution balances the computational load between the CPU and the GPU, aiming for a minimum overall execution time. The parallelized implementation is evaluated for a specific weather situation: the precipitation of the low-pressure zone “Bernd” in July 2021 is simulated on an Intel Core i7-7700 CPU and an NVIDIA GTX 1070 GPU. The results show a speedup of up to 28.51 for the GPU version compared with the reference implementation. Heterogeneous dynamic load balancing increases this speedup further by means of a distribution factor that is updated at every time step.
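The abstract does not state how the distribution factor is updated, so the following is only a minimal sketch of one plausible rule, not the authors' method. It assumes the factor is the fraction of grid columns assigned to the GPU and that it is moved toward the split at which both devices would have finished simultaneously, based on the timings measured in the previous time step. The names `update_gpu_fraction`, `split_columns`, and the `damping` parameter are hypothetical.

```python
def update_gpu_fraction(gpu_fraction, cpu_time, gpu_time, damping=0.5):
    """Return the GPU work share for the next time step (illustrative only).

    gpu_fraction: share of the work given to the GPU in the last step, in (0, 1)
    cpu_time, gpu_time: measured wall-clock times of the two devices in that step
    damping: hypothetical smoothing weight in (0, 1]; 1.0 jumps straight to the target
    """
    if not 0.0 < gpu_fraction < 1.0:
        raise ValueError("gpu_fraction must lie strictly between 0 and 1")
    if cpu_time <= 0.0 or gpu_time <= 0.0:
        raise ValueError("measured times must be positive")
    if not 0.0 < damping <= 1.0:
        raise ValueError("damping must lie in (0, 1]")

    # Observed throughput of each device: share of work processed per second.
    cpu_rate = (1.0 - gpu_fraction) / cpu_time
    gpu_rate = gpu_fraction / gpu_time

    # Both devices finish at the same time when the shares match the rates.
    target = gpu_rate / (cpu_rate + gpu_rate)

    # Move only part of the way to the target to smooth out timing noise.
    return gpu_fraction + damping * (target - gpu_fraction)


def split_columns(n_columns, gpu_fraction):
    """Split n_columns grid columns into (cpu_count, gpu_count)."""
    n_gpu = round(n_columns * gpu_fraction)
    return n_columns - n_gpu, n_gpu


if __name__ == "__main__":
    # Synthetic devices (assumed rates, not measurements from the article).
    cpu_speed, gpu_speed = 1.0, 9.0
    fraction = 0.5
    for step in range(5):
        cpu_time = (1.0 - fraction) / cpu_speed
        gpu_time = fraction / gpu_speed
        fraction = update_gpu_fraction(fraction, cpu_time, gpu_time)
        print(step, round(fraction, 4), split_columns(1000, fraction))
```

With the synthetic rates used in the demo, the fraction moves toward 0.9, the point at which both devices take equally long. A damping value below 1.0 trades convergence speed for robustness against noisy per-step timings. Whether the article uses smoothing at all is not stated in the abstract.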