Optimization of atmospheric transport models on HPC platforms

Raúl De La Cruz, Arnau Folch, Pau Farré, Javier Cabezas, Nacho Navarro, José María Cela

doi:10.1016/j.cageo.2016.08.019

Open Access

Journal: Computers & Geosciences
Publication Date: Aug 27, 2016
Citations: 4
License type: cc-by-nc-nd

Affiliation: Barcelona Supercomputing Center

Abstract

The performance and scalability of atmospheric transport models on high-performance computing environments are often far from optimal for several reasons, including sequential input and output, synchronous communications, load imbalance, memory access latency, and lack of task overlapping. We investigate how different software optimizations and porting to non-general-purpose hardware architectures improve code scalability and execution times, taking the FALL3D volcanic ash transport model as an example. To this end, we implement the FALL3D model equations in the WARIS framework, software designed from scratch to solve different geoscience problems in parallel and efficiently on a wide variety of architectures. In addition, we consider further improvements in WARIS such as hybrid MPI-OMP parallelization, spatial blocking, auto-tuning and thread affinity. With all these aspects combined, the FALL3D execution times for a realistic test case running on general-purpose cluster architectures (Intel Sandy Bridge) decrease by a factor of between 7 and 40, depending on the grid resolution. Finally, we port the application to accelerator-based architectures, Intel Xeon Phi (MIC) and NVIDIA GPUs (CUDA), and compare performance, cost and power consumption across all the architectures. Implications for time-constrained operational model configurations are discussed.

Full Text