Abstract

In this paper, we present a parallel version of the finite-element model of the Arctic Ocean (FEMAO) configured for the White Sea and based on MPI technology. The model consists of two main parts: an ocean dynamics model and a surface ice dynamics model. These parts differ greatly in computational cost because the complexity of the ocean part depends on the bottom depth, while that of the sea-ice component does not. As a first step, we decided to locate both submodels on the same CPU cores with a common horizontal partition of the computational domain. The model domain is divided into small blocks, which are distributed over the CPU cores using Hilbert-curve balancing. Partitioning of the model domain is static (i.e., computed during the initialization stage). There are three baseline options: a single block per core, balancing of 2D computations, and balancing of 3D computations. After showing parallel acceleration for particular ocean and ice procedures, we construct the common partition, which minimizes joint imbalance in both submodels. The novelty of our approach is the use of arrays shared by all blocks that belong to a CPU core instead of allocating separate arrays for each block, as is usually done. Computations on a CPU core are restricted by the masks of non-land grid nodes and block–core correspondence. This approach keeps the parallel implementation as simple as with the usual decomposition into squares while improving load balancing. We demonstrate parallel acceleration on up to 996 cores for the model with a resolution of 500×500×39 in the ocean component and 43 sea-ice scalars, and we carry out a detailed analysis of how different partitions affect the model runtime.

Highlights

  • The increasing performance and availability of multiprocessor computing devices make it possible to simulate complex natural systems with high resolution, while taking into account important phenomena and coupling comprehensive models of various subsystems.

  • The finite-element model of the Arctic Ocean (FEMAO; Iakovlev, 1996, 2012) has been under development since the 1990s and has been adapted to the White Sea (Chernov, 2013; Chernov et al., 2018).

  • Grid partition is performed in two steps: the model domain is decomposed into small blocks, and these blocks are distributed over CPU cores in such a way that computational load imbalance is minimized.


Summary

Introduction

The increasing performance and availability of multiprocessor computing devices make it possible to simulate complex natural systems with high resolution, while taking into account important phenomena and coupling comprehensive models of various subsystems. The baseline strategy for parallelizing such models is to decompose the domain into squares (Madec et al., 2008) or into small blocks that are subsequently distributed over the processor cores (Dennis, 2003, 2007; Chaplygin et al., 2019). Both approaches preserve the original structure of the loops and allow direct referencing of neighboring grid nodes on rectangular grids.
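The two-step partition summarized in the abstract (small blocks first, then distribution over cores along a Hilbert curve) can be illustrated with a minimal Python sketch. It is not the FEMAO implementation. The function names `hilbert_index`, `partition_blocks`, and `imbalance` are hypothetical, the block grid is assumed to be square with a power-of-two side, and each block's weight is assumed to be a precomputed cost estimate (for example, the number of non-land nodes for 2D balancing, or non-land nodes times the number of vertical levels for 3D balancing).

```python
def hilbert_index(n, x, y):
    """Position of block (x, y) along a Hilbert curve on an n-by-n grid.

    n must be a power of two. This is the standard bitwise
    coordinate-to-index conversion.
    """
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if (x & s) else 0
        ry = 1 if (y & s) else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/flip the quadrant so the next level is in canonical orientation.
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s //= 2
    return d


def partition_blocks(weights, n_cores):
    """Assign non-land blocks to cores in contiguous runs along the curve.

    weights[y][x] is the assumed cost of block (x, y); 0 marks an
    all-land block, which is skipped. Returns {(x, y): core}.
    """
    n = len(weights)
    order = sorted(
        (hilbert_index(n, x, y), x, y)
        for y in range(n)
        for x in range(n)
        if weights[y][x] > 0
    )
    total = sum(weights[y][x] for _, x, y in order)
    owner = {}
    acc = 0.0  # cumulative weight of the blocks already placed
    for _, x, y in order:
        w = weights[y][x]
        # Pick the core whose equal share of the total contains this block's midpoint.
        core = min(n_cores - 1, int((acc + 0.5 * w) * n_cores / total))
        owner[(x, y)] = core
        acc += w
    return owner


def imbalance(weights, owner, n_cores):
    """Ratio of the maximum core load to the mean core load (1.0 is ideal)."""
    loads = [0.0] * n_cores
    for (x, y), core in owner.items():
        loads[core] += weights[y][x]
    return max(loads) / (sum(loads) / n_cores)
```

Because the assignment is computed once from static weights, it matches the static partitioning described in the abstract. The midpoint rule used to cut the curve is only one simple choice; with few or very uneven blocks it can leave a core empty, so a production code would likely use a more careful splitting step.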

The White Sea
The model and the software
Organization of the calculations in non-parallel code
Modifications of the non-parallel code
Trivial 1block partition
Hilbert-curve partition
Data structure and MPI exchanges
Parallel solver of the SLAE
Numerical experiments
Speedup of scalar and ice advection
Speedup of the full model
Findings
Conclusions