The article presents a strategy and its algorithm to compile a simulation-accompanying, incremental Singular Value Decomposition (SVD) for time-evolving, spatially parallel discrete data sets. The framework targets state-of-the-art PDE solvers for computational science and engineering applications. An important characteristic of such applications is that the spatial size of the data is often time-invariant and significantly exceeds the temporal size, owing to the large computational grids of 3D applications. Typical examples, which are also considered in this article, are results extracted from unsteady flow simulations. Herein, the flow data, which progresses over time, is frequently computed in a spatially parallel manner based on domain decomposition strategies, which allow the simulation to be parallelized on distributed memory machines following a Single Instruction Multiple Data (SIMD) concept. With a view to the memory-efficient reuse of (compressed) simulation results and their CPU time-saving, sufficiently accurate generation, the paper scrutinizes the efficiency of incremental/parallel SVD approaches for such simulation examples. To improve the computational efficiency, a bunch matrix is proposed, which aggregates the snapshots of multiple time steps so that they are processed in fewer, combined SVD updates. The suggested strategy is verified and validated for simple 2D laminar single-phase flows and subsequently applied to more complex 2D and 3D turbulent two-phase flows. Emphasis is given to (a) the accuracy of the SVD-based reconstruction, (b) the physical realizability of the reconstructed quantities, (c) the independence of the results from the domain partitioning, (d) an efficient snapshot bunching, and (e) related implementation aspects. In addition, the influence of lower and (adaptive) upper rank thresholds on the effort and accuracy is evaluated.
A final application demonstrates the practical benefits of the approach and refers to a merchant ship in head waves at Re = 1.4×10⁷ and Fn = 0.26. The simulation involves 2880 processor cores, and the related full-rank snapshot matrix has (10⁸×10⁴) entries. With a numerical overhead of O(10%), this snapshot matrix can be incrementally generated and compressed by O(95%). The compression is accompanied by only small errors in the integral force and local wave elevation of O(10⁻²%). This qualifies the method for efficient subsequent data processing.
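
The bunching idea described above can be illustrated with a minimal serial sketch in Python/NumPy. It is not the article's parallel algorithm; it assumes a standard Brand-type block update of a truncated SVD, and the names `update_svd`, `incremental_svd`, `bunch_size`, and `max_rank` are hypothetical and chosen for illustration only.

```python
import numpy as np


def update_svd(U, s, Vt, B, max_rank):
    """Append a bunch of snapshot columns B to the SVD U @ diag(s) @ Vt.

    Assumed Brand-type block update (illustrative, not the article's code).
    """
    r, b, n = s.size, B.shape[1], Vt.shape[1]
    M = U.T @ B                       # part of B inside the current basis
    Q, R = np.linalg.qr(B - U @ M)    # orthonormal basis of the residual
    # Small (r+b) x (r+b) core matrix with [A, B] = [U, Q] @ K @ W
    K = np.zeros((r + b, r + b))
    K[:r, :r] = np.diag(s)
    K[:r, r:] = M
    K[r:, r:] = R
    Uk, sk, Vkt = np.linalg.svd(K)
    # W extends the right factor by an identity block for the new columns
    W = np.zeros((r + b, n + b))
    W[:r, :n] = Vt
    W[r:, n:] = np.eye(b)
    k = min(max_rank, r + b)          # upper rank threshold (assumed fixed)
    U_new = np.hstack([U, Q]) @ Uk
    return U_new[:, :k], sk[:k], (Vkt @ W)[:k, :]


def incremental_svd(snapshots, bunch_size, max_rank):
    """Process the snapshot columns bunch by bunch instead of one by one."""
    first = snapshots[:, :bunch_size]
    U, s, Vt = np.linalg.svd(first, full_matrices=False)
    k = min(max_rank, s.size)
    U, s, Vt = U[:, :k], s[:k], Vt[:k, :]
    for start in range(bunch_size, snapshots.shape[1], bunch_size):
        B = snapshots[:, start:start + bunch_size]
        U, s, Vt = update_svd(U, s, Vt, B, max_rank)
    return U, s, Vt
```

In this sketch, a larger `bunch_size` means fewer but larger updates, which is the trade-off the bunch matrix is meant to exploit. In a domain-decomposed setting the rows of `U` and `B` would be distributed across processes, so products such as `U.T @ B` would presumably require a global reduction; that part is omitted here.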