Abstract

The finite-time Lyapunov exponent (FTLE) is widely used to extract coherent structures from unsteady flows. However, computing the FTLE can be highly time-consuming, which greatly limits the performance of FTLE-based applications. In this paper, we accelerate a double-precision, PDE-based FTLE application for two- and three-dimensional analytical flow fields on Intel multi-core and many-core architectures such as Intel Sandy Bridge and the Intel Many Integrated Core (MIC) coprocessor. Based on an analysis of the FTLE calculation process and the characteristics of these architectures, we employ three categories of optimization techniques to maximize performance: thread parallelism for multi-/many-core scaling, data parallelism to exploit the SIMD (single-instruction, multiple-data) mechanism, and improved on-chip data reuse. We also discuss hardware performance metrics, collected with an open-source performance analysis tool, to explain the performance difference between Sandy Bridge and MIC. The experimental results show that our MIC-enabled FTLE achieves a speed-up of about 1.8× relative to a parallel computation on two Intel Sandy Bridge CPUs, and perfect parallel efficiency is also observed.
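For readers unfamiliar with the quantity being accelerated, the following is a minimal sketch of the FTLE computed from a sampled two-dimensional flow map, using the standard flow-map definition: the logarithm of the square root of the largest eigenvalue of the Cauchy-Green tensor, divided by the integration time. It is not the PDE-based formulation the paper optimizes, and the function name `ftle_from_flow_map`, the grid, and the linear saddle test flow are illustrative assumptions rather than anything taken from the paper.

```python
import numpy as np

def ftle_from_flow_map(phi_x, phi_y, dx, dy, T):
    """FTLE field from a sampled 2-D flow map (hypothetical helper).

    phi_x, phi_y: final particle positions, arrays indexed [y, x].
    dx, dy: grid spacings of the initial positions.
    T: integration time (nonzero).
    """
    # Flow-map gradient by finite differences (axis 0 is y, axis 1 is x).
    dphix_dy, dphix_dx = np.gradient(phi_x, dy, dx)
    dphiy_dy, dphiy_dx = np.gradient(phi_y, dy, dx)

    # Entries of the symmetric Cauchy-Green tensor C = F^T F.
    c11 = dphix_dx**2 + dphiy_dx**2
    c12 = dphix_dx * dphix_dy + dphiy_dx * dphiy_dy
    c22 = dphix_dy**2 + dphiy_dy**2

    # Largest eigenvalue of a symmetric 2x2 matrix in closed form.
    tr = c11 + c22
    det = c11 * c22 - c12**2
    lam_max = 0.5 * (tr + np.sqrt(np.maximum(tr**2 - 4.0 * det, 0.0)))

    return np.log(np.sqrt(lam_max)) / abs(T)

# Assumed test flow: linear saddle u = (x, -y), whose flow map is known exactly.
T = 2.0
x = np.linspace(-1.0, 1.0, 21)
y = np.linspace(-1.0, 1.0, 21)
X, Y = np.meshgrid(x, y)           # arrays indexed [y, x]
phi_x = X * np.exp(T)              # stretching direction
phi_y = Y * np.exp(-T)             # compressing direction

sigma = ftle_from_flow_map(phi_x, phi_y, x[1] - x[0], y[1] - y[0], T)
```

For this saddle flow the stretching rate is uniform, so `sigma` equals the analytical value 1 at every grid point. Each grid point is processed independently, which is the property that makes thread-level and SIMD-level parallelism natural fits for this kind of computation.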
