Abstract

Communications in distributed-memory supercomputers still limit the scalability of geophysical models. Considering recent trends in the semiconductor industry, we think this problem is here to stay. We present the optimizations implemented in version 4.0 of the ocean model NEMO to improve its scalability. Thanks to a collaboration between oceanographers and HPC experts, we identified and removed unnecessary communications in two bottleneck routines: the computation of the free-surface pressure gradient and the forcing at straight or unstructured open boundaries. Since a poor choice of parallel decomposition can undermine computing performance, we automate its definition in all cases, including when subdomains containing only land points are excluded from the decomposition. For a smaller audience of developers and vendors, we propose a new benchmark configuration that is easy to use while retaining the full complexity of operational configurations.
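The automatic decomposition mentioned in the abstract can be sketched in a few lines. The following is a minimal Python illustration under stated assumptions, not NEMO's actual Fortran implementation: the helper names (`candidate_grids`, `sea_subdomains`, `best_decomposition`) and the halo-inclusive size estimate are inventions for this example. Given a land–sea mask and a target number of MPI tasks, it enumerates rectangular splits, drops subdomains containing only land points, and keeps a split whose sea-subdomain count matches the task count while minimizing subdomain size.

```python
import numpy as np

def candidate_grids(ntasks):
    """All (ni, nj) splits with ni * nj >= ntasks (up to ntasks along each axis)."""
    return [(i, j) for i in range(1, ntasks + 1)
                   for j in range(1, ntasks + 1) if i * j >= ntasks]

def sea_subdomains(mask, ni, nj):
    """Count subdomains of an ni x nj split holding at least one sea point.

    mask: 2-D array, 1 = sea, 0 = land.
    """
    count = 0
    for row in np.array_split(mask, nj, axis=0):
        for block in np.array_split(row, ni, axis=1):
            if block.any():          # subdomain is not land-only
                count += 1
    return count

def best_decomposition(mask, ntasks):
    """Smallest split whose sea-only subdomain count equals the task count."""
    best = None
    for ni, nj in candidate_grids(ntasks):
        if sea_subdomains(mask, ni, nj) == ntasks:
            # rough per-subdomain size, including a 1-point halo on each side
            size = (mask.shape[1] // ni + 2) * (mask.shape[0] // nj + 2)
            if best is None or size < best[0]:
                best = (size, ni, nj)
    return best
```

On a domain whose right half is land, splitting along the east–west axis lets land-only subdomains be discarded, so fewer, larger sea subdomains cover the same ocean with the same number of tasks.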

Highlights

  • There is no longer a need to justify the importance of climate research for our societies (Masson-Delmotte et al., 2018)

  • This work complements the report of Maisonnave and Masson (2019) by presenting the new HPC optimizations that have been implemented in NEMO 4.0, the current reference version of the code

  • The domain decomposition proposed by default in NEMO up until version 3.6 was very basic: 2 was the only prime factor considered when searching for divisors of the number of MPI tasks
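To illustrate the last point: restricting the divisor search to powers of 2 misses many balanced splits. A small sketch (hypothetical helper names, not NEMO code) enumerates every factor pair of the MPI task count and prefers the most square one:

```python
def factor_pairs(n):
    """All (ni, nj) with ni * nj == n -- every divisor, not only powers of 2."""
    return [(d, n // d) for d in range(1, n + 1) if n % d == 0]

def most_square(n):
    """Prefer the split closest to square, which minimizes halo perimeter."""
    return min(factor_pairs(n), key=lambda p: abs(p[0] - p[1]))
```

For 36 tasks, considering only powers of 2 as first factors yields (1, 36), (2, 18) and (4, 9), missing the balanced (6, 6) split that the full divisor search finds.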



Introduction

There is no longer a need to justify the importance of climate research for our societies (Masson-Delmotte et al., 2018). The numerical performance of climate models is key and must be kept at the best possible level in order to minimize both the time-to-solution and the energy-to-solution. This optimization work must improve performance while preserving the accessibility of the code for the climate scientists who use and develop it but who are not necessarily experts in computing science. Within this framework, we gathered in this study authors with very complementary profiles: oceanographers, NEMO developers, engineers specialized in climate modeling and frontier simulations, and pure HPC engineers. We first describe the new features that have been added to the code to support the optimization work. The last section (Sect. 4) discusses and concludes this work.

Optimum dynamic sub-domain decomposition
Optimal domain decomposition research algorithm
Getting land–sea mask information
Getting the best domain decomposition sorted from 1 to Nsubmax subdomains
Additional optimization to minimize the impact of the North Pole folding
The BENCH configuration
BENCH general description
BENCH flexibility
Dedicated tool for communication cost measurement
Reducing or removing unnecessary MPI communications
Free surface computation optimization
Open-boundary communication optimization
Straight open boundaries along domain edges
Unstructured open boundaries
Performance improvement
Findings
Conclusions