Abstract

In this paper, we develop and parallelize a CFD solver that supports overlapped meshes on multiple MIC architectures using a multithreaded technique. We optimize the solver through several considerations, including vectorization, memory arrangement, and an asynchronous strategy for data exchange among multiple devices. Different vectorization strategies are compared, and the performance of the solver's core functions is reported. Experiments show that, compared to an Intel E5-2680 processor, a speedup of about 3.16x can be achieved for the six core functions on a single Intel Xeon Phi 5110P MIC card, and a speedup of 5.9x can be achieved using two cards, for a test case with two ONERA M6 wings.

Highlights

  • Computing with accelerators such as the graphics processing unit (GPU) [1] and the Intel Many Integrated Core (MIC) architecture [2] has become attractive in computational fluid dynamics (CFD) in recent years because it offers researchers the possibility of accelerating or scaling their numerical codes with various parallel techniques

  • Our experiments were conducted on the YUAN cluster at the Computer Network Information Center at the Chinese Academy of Sciences. The cluster has a hybrid architecture that consists of both MIC and GPU nodes. Each MIC node has two Intel E5-2680 V2 (Ivy Bridge, 2.8 GHz, 10 cores) CPUs and two Intel Xeon Phi 5110P MIC coprocessors. The memory capacity for the host and coprocessors is 64 GB and 8 GB, respectively

  • We used two ONERA M6 wings, each of which was configured with four 129 × 113 × 105 subblocks. The lower wing and its mesh system were formed by translating the upper wing down along the Y-axis by the length of the wing, and the two mesh systems overlapped with each other
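To give a sense of the problem size implied by the test case above, the following minimal sketch counts the grid points and illustrates the downward translation along the Y-axis. The function names and the `wing_length` value are hypothetical and chosen only for illustration. The count treats every subblock independently, so points shared on block interfaces are counted once per block.

```python
# Subblock dimensions and counts as stated in the text.
NI, NJ, NK = 129, 113, 105
BLOCKS_PER_WING = 4
WINGS = 2


def points_per_block(ni=NI, nj=NJ, nk=NK):
    """Number of grid points in one structured subblock."""
    return ni * nj * nk


def total_points():
    """Grid points over all subblocks of both wings (interfaces counted per block)."""
    return points_per_block() * BLOCKS_PER_WING * WINGS


def translate_down_y(points, wing_length):
    """Return a copy of the (x, y, z) points shifted down the Y-axis by wing_length."""
    return [(x, y - wing_length, z) for (x, y, z) in points]


if __name__ == "__main__":
    print(points_per_block())  # 1530585
    print(total_points())      # 12244680
    # Hypothetical wing length of 1.0; the actual value is not given in the text.
    print(translate_down_y([(0.0, 0.5, 0.0)], wing_length=1.0))  # [(0.0, -0.5, 0.0)]
```

Under these assumptions the case holds roughly 12.2 million grid points in total, which helps put the 8 GB memory of each coprocessor in context.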


Summary

Introduction

Computing with accelerators such as the graphics processing unit (GPU) [1] and the Intel Many Integrated Core (MIC) architecture [2] has become attractive in computational fluid dynamics (CFD) in recent years because it offers researchers the possibility of accelerating or scaling their numerical codes with various parallel techniques. The Intel MIC architecture consists of processors that inherit many key features of Intel CPU cores, which makes code migration less expensive and has made the architecture popular for developing parallel algorithms. Many researchers [13,14,15,16,17] have studied GPU computing on structured meshes, covering coalesced computation techniques [13], heterogeneous algorithms [15, 17], numerical methods [16], etc. Corrigan et al. [18] investigated an Euler solver on GPUs using an unstructured grid and obtained a significant speedup over CPUs. Then, a lot of results included data structure

