OpenMP Directives Research Articles

SUMMARY: An efficient parallel spectral method for direct numerical simulations of transitional and turbulent flows is described in this paper. The parallelization is based on a classical two-dimensional domain decomposition, but has been developed specifically for a solenoidal Fourier–Chebyshev spectral approximation in which the number of modes in one Fourier direction is very large compared with the other two directions. The approach therefore differs from classical libraries developed for cubic Fourier boxes. The strategy uses the message-passing interface (MPI) for communication among nodes and is fairly portable. One original feature of this paper is an efficient hybrid programming model that combines MPI for internode communication with coarse-grained OpenMP parallelism for shared-memory computation on the cores of a node, instead of the classical hybrid model that combines MPI with fine-grained, loop-level parallelism using OpenMP directives. This hybrid parallelism has been tested on the recent generation of high-performance parallel supercomputers with a few tens of cores per node. Performance is evaluated on several massively parallel platforms with low-frequency and high-frequency processors. We demonstrate that spectral methods, which are known to be inherently ill-suited to the new generation of high-performance distributed-memory computers, can be implemented efficiently using this hybrid programming model, with good scalability and a very short wall-clock time per iteration. New numerical experiments are therefore now accessible on petascale computers, while keeping the attractive features of spectral methods such as accuracy, exponential convergence, computational efficiency and conservation properties. This is illustrated by a direct numerical simulation of the transition of the boundary layers that develop from the entrance section of a plane channel and interact to merge into a fully turbulent flow. Copyright © 2012 John Wiley & Sons, Ltd.
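The distinction the abstract draws between fine-grained (loop-level) and coarse-grained OpenMP parallelism can be illustrated with a minimal sketch. This is not the paper's code: the three-point smoothing kernel, the function names `smooth_fine` and `smooth_coarse`, and the block partitioning are all assumptions chosen for illustration, and the MPI layer is left out entirely.

```c
#include <assert.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Fine-grained: a directive on the loop lets the runtime split its iterations. */
void smooth_fine(const double *in, double *out, int n)
{
    assert(n >= 2);
    out[0] = in[0];
    out[n - 1] = in[n - 1];
    #pragma omp parallel for
    for (int i = 1; i < n - 1; ++i)
        out[i] = 0.25 * in[i - 1] + 0.5 * in[i] + 0.25 * in[i + 1];
}

/* Coarse-grained: one parallel region in which each thread works out
 * its own contiguous block of interior points from its thread id. */
void smooth_coarse(const double *in, double *out, int n)
{
    assert(n >= 2);
    out[0] = in[0];
    out[n - 1] = in[n - 1];
    #pragma omp parallel
    {
        int tid = 0, nthreads = 1;   /* serial fallback without OpenMP */
#ifdef _OPENMP
        tid = omp_get_thread_num();
        nthreads = omp_get_num_threads();
#endif
        int interior = n - 2;
        int chunk = (interior + nthreads - 1) / nthreads;  /* ceiling division */
        int lo = 1 + tid * chunk;
        int hi = lo + chunk;
        if (hi > n - 1)
            hi = n - 1;
        for (int i = lo; i < hi; ++i)
            out[i] = 0.25 * in[i - 1] + 0.5 * in[i] + 0.25 * in[i + 1];
    }
}
```

Both functions compute the same result. In the coarse-grained form the work decomposition is explicit, so a longer sequence of operations can stay inside a single parallel region rather than opening a new region for every loop. The code also compiles and runs serially when OpenMP is not enabled.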

The finite-difference time-domain (FDTD) method allows the electromagnetic field distribution to be analyzed as a function of time and space. Here the method is applied to analyze the near-field distribution of holographic volume gratings (HVGs) at optical wavelengths. This application usually requires the simulation of wide areas, which implies more memory and processing time. In this work, we propose a specific implementation of the FDTD method that includes several add-ons for precise simulation of optical diffractive elements. Values in the near-field region are computed with the grating illuminated by a plane wave at different angles of incidence, and with absorbing boundaries included. We compare the results obtained by FDTD with those obtained using a matrix method (MM) applied to diffraction gratings. In addition, we have developed two optimized versions of the algorithm, for CPU and GPU, in order to analyze the improvement offered by the new NVIDIA Fermi GPU architecture over a highly tuned multi-core CPU as a function of simulation size. In particular, the optimized CPU implementation takes advantage of the arithmetic and data-transfer streaming SIMD (single instruction, multiple data) extensions (SSE), used explicitly in the code, and of multi-threading by means of OpenMP directives. Good agreement is found between the FDTD and MM results, which validates our methodology. Moreover, the performance of the GPU is compared with that of the SSE+OpenMP CPU implementation, and it is quantitatively determined that a highly optimized CPU program can be competitive over a wider range of simulation sizes, whereas GPU computing becomes more powerful for large-scale simulations.
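As a rough illustration of how OpenMP directives apply to FDTD update loops, the sketch below advances a one-dimensional field by one leapfrog step. It is an assumption-laden toy, not the authors' implementation: it is 1D, uses normalized units with a single hypothetical `courant` parameter, holds both end points of `ez` fixed instead of applying absorbing boundaries, and omits the explicit SSE code the abstract mentions.

```c
#include <assert.h>

/* One leapfrog step of a 1D FDTD (Yee) scheme in normalized units.
 * ez has n samples; hy[i] sits between ez[i] and ez[i + 1], so only
 * hy[0..n-2] is used. */
void fdtd_step_1d(double *ez, double *hy, int n, double courant)
{
    assert(n >= 2);

    /* Update H from the spatial difference of E. Each iteration writes
     * only hy[i], so the iterations are independent. */
    #pragma omp parallel for
    for (int i = 0; i < n - 1; ++i)
        hy[i] += courant * (ez[i + 1] - ez[i]);

    /* Update E from the spatial difference of the new H. The end points
     * ez[0] and ez[n - 1] are left untouched (fixed boundary). */
    #pragma omp parallel for
    for (int i = 1; i < n - 1; ++i)
        ez[i] += courant * (hy[i] - hy[i - 1]);
}
```

The two loops must stay separate because the E update reads the H values just written. The implicit barrier at the end of the first `parallel for` provides that ordering.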

Articles published on OpenMP Directives

Performance of a Code Migration for the Simulation of Supersonic Ejector Flow to SMP, MIC, and GPU Using OpenMP, OpenMP+LEO, and OpenACC Directives

Overhead Analysis of Loop Parallelization with OpenMP Directives

NDL-v2.0: A new version of the numerical differentiation library for parallel architectures

Simulating the filtration combustion of gases on multi-core computers

Hybrid parallelism in MFIX CFD-DEM using OpenMP

Exploration on a Fast EHL Computing Technology for Analyzing Journal Bearings with Engineered Surface Textures

Performance analysis of SSE and AVX instructions in multi-core CPUs and GPU computing on FDTD scheme for solid and fluid vibration problems

Modeling of the turbulent mixing on basis of the large eddy simulation by using parallel computing

Comparing high performance techniques for the automatic generation of efficient solvers of cardiac cell models

Asynchronous Approach to Memory Management in Sparse Multifrontal Methods on Multiprocessors

Automated Derivation of the Adjoint of High-Level Transient Finite Element Programs

Towards petascale spectral simulations for transition analysis in wall bounded flow

Parallel simulation of Brownian dynamics on shared memory systems with OpenMP and Unified Parallel C

CorrelaGenes: a new tool for the interpretation of the human transcriptome

Performance analysis of the FDTD method applied to holographic volume gratings: Multi-core CPU versus GPU computing

Parallel computation of satellite orbit acceleration

Automatic Parallelization of Array-oriented Programs for a Multi-core Machine

Workstation Computing of Discretized Reynolds Equations

Low-energy electron collisions with glycine

Speedup Using OpenMP Directives and Guaranteeing the Randomness of the Monte Carlo Method in a Ray-Tracing-Based Integrating Sphere Analysis Simulator
