Abstract

Optimizing the geometry of airfoils for a specific application is an important engineering problem. In this context, genetic algorithms have enjoyed some success because they can explore the search space without getting stuck in local optima. However, these algorithms require the computation of aerodynamic properties for a significant number of airfoil geometries. Consequently, for low-speed aerodynamics, panel methods are most often used as the inner solver. In this paper we evaluate the performance of such an optimization algorithm on modern accelerators (more specifically, the Intel Xeon Phi 7120 and the NVIDIA K80). For that purpose, we have implemented an optimized version of the algorithm on the CPU and Xeon Phi (based on OpenMP, vectorization, and the Intel MKL library) and on the GPU (based on CUDA and the MAGMA library). We present timing results for all codes and discuss the similarities and differences between the three implementations. Overall, we observe a speedup of approximately 2.5 for adding an Intel Xeon Phi 7120 to a dual socket workstation and a speedup between 3.4 and 3.8 for adding an NVIDIA K80 to a dual socket workstation.

Highlights

  • Numerical simulations are routinely used in applications to predict the properties of fluid flow over a solid geometry

  • We have used the vectorization report of the Intel C compiler to check that the compiler has sufficient information to vectorize the time-intensive portions of our algorithm. This has to be contrasted with the graphics processing unit (GPU) implementation of the assembly step, which is relatively straightforward

  • We remark that the speedup compared to the single GPU implementation is in all cases within 5% of the maximal achievable speedup


Summary

Introduction

Numerical simulations are routinely used in applications to predict the properties of fluid flow over a solid geometry. The main advantage of panel methods is that they are computationally cheap, which makes them ideally suited as the inner solver in an optimization algorithm. They are able to faithfully reproduce the relevant aerodynamic quantities for low-speed aerodynamics [8]. The described optimization problem lends itself well to parallelization. As such, it can potentially profit significantly from accelerators such as graphics processing units (GPUs) or the Intel Xeon Phi. Some papers have been published that implement panel methods on GPUs (see, for example, the work conducted in [9,10,11,12]). The purpose of the present work is to parallelize the optimization problem described above (of which the panel method is the computationally most demanding part) on both traditional CPU-based systems and the GPU, and to compare their performance.
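To illustrate why this kind of optimization parallelizes well, the following is a minimal Python sketch of a genetic algorithm in which every candidate is scored independently of the others. It is not the implementation evaluated in the paper. The function `fitness` is a hypothetical stand-in for the panel-method solve, and the encoding of a geometry as a list of numbers in [0, 1], the population size, and the selection, crossover, and mutation rules are all assumptions chosen for brevity.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def fitness(genome):
    # Hypothetical stand-in for the inner solver. A real code would run a
    # panel method on the airfoil encoded by `genome` and return an
    # aerodynamic figure of merit. Here: a toy function with its maximum
    # (zero) at genome == [0.5, 0.5, ...].
    return -sum((g - 0.5) ** 2 for g in genome)


def evolve(pop_size=32, n_genes=6, generations=40, seed=0, workers=4):
    rng = random.Random(seed)
    population = [[rng.random() for _ in range(n_genes)]
                  for _ in range(pop_size)]

    with ThreadPoolExecutor(max_workers=workers) as pool:
        for _ in range(generations):
            # Each evaluation depends only on its own genome, so the whole
            # population can be scored concurrently. The thread pool only
            # shows that independence; it is not a performance claim.
            scores = list(pool.map(fitness, population))

            # Rank by score and keep the better half unchanged (elitism).
            ranked = [g for _, g in sorted(zip(scores, population),
                                           key=lambda p: p[0], reverse=True)]
            parents = ranked[: pop_size // 2]

            # Refill the population with one-point crossover plus a small
            # Gaussian mutation on one gene, clamped to [0, 1].
            children = []
            while len(parents) + len(children) < pop_size:
                a, b = rng.sample(parents, 2)
                cut = rng.randrange(1, n_genes)
                child = a[:cut] + b[cut:]
                i = rng.randrange(n_genes)
                child[i] = min(1.0, max(0.0, child[i] + rng.gauss(0.0, 0.05)))
                children.append(child)
            population = parents + children

        # Score the final population once more to pick the best candidate.
        scores = list(pool.map(fitness, population))

    best_score, best_genome = max(zip(scores, population), key=lambda p: p[0])
    return best_genome, best_score


if __name__ == "__main__":
    genome, score = evolve()
    print("best score:", score)
```

In this sketch the only work shared between candidates is the ranking and breeding step, which is cheap compared with the evaluations. That structure is what makes it plausible to hand the evaluations to an accelerator, with the caveat that a real panel-method solve is far more expensive than the toy `fitness` used here.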

Numerical algorithm
Computational considerations
GPU implementation
Intel Xeon Phi implementation
Multiple GPU implementation
Findings
Conclusion