High Performance Computing of Complex Electromagnetic Algorithms Based on GPU/CPU Heterogeneous Platform and Its Applications to EM Scattering and Multilayered Medium Structure

Zhe Song, Hou-Xing Zhou, Xing Mu

doi:10.1155/2017/9173062

Abstract

Fast and accurate numerical analysis of large-scale objects and complex structures is essential to electromagnetic (EM) simulation and design. Compared with the exploration of EM algorithms from a mathematical point of view, their realization in computer programs is equally significant, and it must keep pace with the development of hardware architectures. Unlike earlier parallel algorithms, or those implemented on multicore CPUs with OpenMP or on computer clusters with MPI, the new type of large-scale parallel processor based on the graphics processing unit (GPU) has shown impressive ability in various supercomputing scenarios, and its application to computational electromagnetics is especially anticipated. This paper introduces our recent work on high performance computing based on a GPU/CPU heterogeneous platform and its application to EM scattering problems and planar multilayered medium structures, including a novel realization of OpenMP-CUDA-MLFMM, a developed ACA method, and a deeply optimized CG-FFT method. Given the numerical examples and the clear efficiency gains they show, it is worthwhile to keep investigating computer hardware and its operating mechanisms in depth.

Highlights

The growing demand for high performance computing algorithms has been one of the most significant topics in computational electromagnetics (CEM)
Our recent work on high performance computing based on a graphics processing unit (GPU)/CPU heterogeneous platform is introduced, including the following: (1) a novel realization of the OpenMP-CUDA-multilevel fast multipole method (MLFMM) for EM scattering problems, in which the near-field matrix filling and the sparse matrix-vector product (MVP) are optimized, together with a warp-level parallel scheme for aggregation/disaggregation and the use of texture memory for 2D local interpolation/anterpolation
Our recent work on high performance computing based on a GPU/CPU heterogeneous platform and its application to EM scattering problems and planar multilayered medium structures is introduced

Summary

Introduction

The growing demand for high performance computing algorithms has been one of the most significant topics in computational electromagnetics (CEM).

All the RWG functions are stored as two copies, namely, a testing chain and a basis chain, and each GPU thread handles one or more nonzero matrix elements by accessing the corresponding data from the chains. This scheme extends to the multi-GPU case by splitting the whole task into subtasks. An optimized parallelization can be realized by the so-called “single instruction multiple thread (SIMT)” model [14].

The flowchart of this optimized CG-FFT on GPU is depicted, in which “RHT” stands for the right-hand term of the MoM matrix equation. The right-hand term of the MoM matrix equation differs among the radiating, transmitting, and scattering cases.
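The thread-to-element mapping described above, in which each GPU thread handles one or more nonzero matrix elements and the whole task is split into subtasks for several GPUs, can be illustrated with a small CPU-side emulation. This is only a sketch under stated assumptions, not the authors' CUDA implementation: the function names, the strided assignment of elements to threads, the contiguous split across devices, and the scalar "chains" with a toy kernel are all hypothetical placeholders chosen for illustration.

```python
# CPU-side emulation of the indexing only; no GPU is used.
# All names below are hypothetical and do not come from the paper.

def split_into_subtasks(num_items, num_devices):
    """Split range(num_items) into contiguous, near-equal chunks, one per device."""
    base, extra = divmod(num_items, num_devices)
    chunks = []
    start = 0
    for device in range(num_devices):
        size = base + (1 if device < extra else 0)
        chunks.append(range(start, start + size))
        start += size
    return chunks


def elements_for_thread(thread_id, num_threads, chunk):
    """Strided assignment: thread t takes chunk[t], chunk[t + T], chunk[t + 2T], ..."""
    return list(chunk[thread_id::num_threads])


def fill_nonzeros(pairs, testing_chain, basis_chain, kernel,
                  num_devices, num_threads):
    """Fill one value per nonzero (testing index, basis index) pair.

    The loops over devices and threads run sequentially here; on real
    hardware each (device, thread) combination would run concurrently.
    """
    values = [None] * len(pairs)
    for chunk in split_into_subtasks(len(pairs), num_devices):
        for thread_id in range(num_threads):
            for k in elements_for_thread(thread_id, num_threads, chunk):
                m, n = pairs[k]
                # Each "thread" reads its data from the two chains.
                values[k] = kernel(testing_chain[m], basis_chain[n])
    return values


# Toy data: scalars stand in for RWG function data (an assumption).
testing_chain = [0.0, 1.0, 2.0, 3.0]
basis_chain = list(testing_chain)  # second copy of the same functions
pairs = [(0, 0), (0, 1), (1, 1), (2, 3), (3, 2)]  # nonzero positions


def toy_kernel(t, b):
    """Placeholder interaction; a real kernel would integrate over RWG pairs."""
    return 1.0 / (1.0 + abs(t - b))


values = fill_nonzeros(pairs, testing_chain, basis_chain, toy_kernel,
                       num_devices=2, num_threads=2)
print(values)
```

With fewer threads than nonzero elements, the strided assignment gives some threads more than one element, which corresponds to the "one or more" wording above. Because every element writes to its own output slot, the subtasks are independent and need no synchronization in this sketch.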

Numerical Examples

2.50 NVIDIA GK104

Conclusions

Conflicts of Interest