Abstract

In situ visualization on high-performance computing systems allows us to analyze simulation results that would otherwise be impossible to process, given the size of the simulation data sets and the execution time of offline post-processing. We develop an in situ adaptor for ParaView Catalyst and Nek5000, a massively parallel Fortran and C code for computational fluid dynamics. We perform a strong-scalability test up to 2048 cores on KTH’s Beskow Cray XC40 supercomputer and assess the impact of in situ visualization on the performance of Nek5000. In our case study, a high-fidelity simulation of turbulent flow, we observe that in situ operations significantly limit the strong scalability of the code, reducing the relative parallel efficiency on 2048 cores to only approximately 21% (the relative efficiency of Nek5000 without in situ operations is approximately 99%). Through profiling with Arm MAP, we identify a bottleneck in the image-composition step (which uses the Radix-kr algorithm), where the majority of the time is spent on MPI communication. We also identify an imbalance of in situ processing time between rank 0 and all other ranks. In our case, better scaling and load balancing in the parallel image composition would considerably improve the performance of Nek5000 with in situ capabilities. More generally, this study highlights the technical challenges posed by the integration of high-performance simulation codes with data-analysis libraries and by their practical use in complex cases, even when efficient algorithms already exist for a given application scenario.
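
For reference, the relative parallel efficiency quoted above follows the standard definition: the speedup over a reference run divided by the corresponding increase in core count. A minimal sketch in Python, with illustrative timings rather than measurements from the paper:

    def relative_efficiency(t_ref, p_ref, t, p):
        """Relative parallel efficiency of a run on p cores taking wall time t,
        measured against a reference run on p_ref cores taking wall time t_ref."""
        return (t_ref * p_ref) / (t * p)

    # Illustrative numbers only: 50 s on 256 cores versus 12 s on 2048 cores
    # gives (50 * 256) / (12 * 2048) ~= 0.52, i.e. 52% relative efficiency.
    print(relative_efficiency(t_ref=50.0, p_ref=256, t=12.0, p=2048))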

Highlights

  • The availability of high-performance computing (HPC) resources and efficient computational methods allows the study of complex turbulent flows via time-dependent high-fidelity numerical simulations

  • The test case that we examine consists of a computational fluid dynamics (CFD) simulation of realistic size, in which ParaView is employed for a standard visualization of vortex clusters in turbulent flows

  • We find that the parallel implementation of the Radix-kr algorithm [17, 22] in ParaView Catalyst is responsible for the time spent in message-passing interface (MPI) communication (see the compositing sketch after this list)
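
Since Radix-kr generalizes the classic binary-swap compositing scheme, the communication pattern is easiest to illustrate with the simpler relative. The Python sketch below implements plain binary swap with mpi4py; it assumes a power-of-two rank count and reduces image blending to a simple sum, and it is meant to show why compositing is communication-bound, not to reproduce the Catalyst implementation.

    # Minimal binary-swap compositing sketch (a simplified relative of
    # Radix-kr). Assumes a power-of-two number of MPI ranks; "blending"
    # is reduced to a sum for brevity. Illustrative only.
    import numpy as np
    from mpi4py import MPI

    comm = MPI.COMM_WORLD
    rank, size = comm.Get_rank(), comm.Get_size()

    pixels = 1920 * 1080
    image = np.random.rand(pixels).astype(np.float32)  # this rank's rendered layer

    lo, hi = 0, pixels        # the pixel span this rank still owns
    step = 1
    while step < size:
        partner = rank ^ step             # pair ranks differing in one bit
        mid = (lo + hi) // 2
        if rank & step:                   # upper partner keeps the back half
            send, keep = (lo, mid), (mid, hi)
        else:                             # lower partner keeps the front half
            send, keep = (mid, hi), (lo, mid)
        recv = np.empty(keep[1] - keep[0], dtype=np.float32)
        comm.Sendrecv(image[send[0]:send[1]], dest=partner,
                      recvbuf=recv, source=partner)
        image[keep[0]:keep[1]] += recv    # blend the incoming half
        lo, hi = keep
        step <<= 1

    # log2(size) exchange rounds leave each rank with one composited tile;
    # the final gather onto rank 0 assembles the full image.
    tiles = comm.gather((lo, image[lo:hi].copy()), root=0)
    if rank == 0:
        final = np.concatenate([t for _, t in sorted(tiles)])

Each round exchanges half of the remaining pixel span with a new partner, so the number of dependent communication rounds grows with the logarithm of the rank count, and the final assembly concentrates work on rank 0, which mirrors the kind of rank-0 imbalance noted in the abstract.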


Introduction

The availability of high-performance computing (HPC) resources and efficient computational methods allows the study of complex turbulent flows via time-dependent high-fidelity numerical simulations. This type of flow is ubiquitous in nature as well as in industrial applications, and it plays a crucial role in phenomena as diverse as atmospheric precipitation and the generation of the lift and drag forces acting on aircraft. A relevant example is the direct numerical simulation (DNS) of the flow around a wing profile in [11], which employed 2.3 × 10⁹ grid points. Carrying out these studies is challenging for two reasons: on the one hand, the computational cost is of the order of several million CPU hours; on the other hand, the data sets created by each simulation can be as large as tens of terabytes.
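
For a sense of scale, the terabyte figure follows from simple arithmetic. In the Python sketch below, the field count, numerical precision, and number of stored snapshots are assumptions chosen for illustration, not values reported in [11]:

    # Back-of-envelope size of a DNS output series. The field count,
    # precision, and snapshot count are illustrative assumptions.
    grid_points = 2.3e9        # grid size of the wing simulation in [11]
    fields = 4                 # e.g. three velocity components plus pressure
    bytes_per_value = 8        # double precision
    snapshots = 500            # stored time steps (assumed)

    snapshot_gb = grid_points * fields * bytes_per_value / 1e9
    total_tb = snapshot_gb * snapshots / 1e3
    print(f"~{snapshot_gb:.0f} GB per snapshot, ~{total_tb:.0f} TB in total")
    # -> roughly 74 GB per snapshot and about 37 TB overall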
