Abstract

A parallel higher-order method of moments (HoMoM) with a GPU-accelerated out-of-core LU solver is presented for analyzing the radiation characteristics of a 1000-element antenna array mounted on a full-size airplane. A parallel framework based on MPI and CUDA is adopted so that the procedures run on a hybrid CPU/GPU cluster. An efficient two-level out-of-core scheme is designed to overcome the limits of both GPU memory and physical memory when solving electrically large and complex problems. To hide the communication time between CPU and GPU, asynchronous transfers are used so that communication overlaps with computation. For large problems that cannot fit in GPU memory or physical memory, the two-level out-of-core LU solver achieves a speedup of about 1.6x over the traditional out-of-core LU solver based on a highly optimized math library.

Highlights

  • The method of moments (MoM) can provide highly accurate results for a wide variety of complex electromagnetic (EM) problems [1, 2]

  • Fast algorithms based on MoM include the fast multipole method (FMM) [3, 4], the multilevel fast multipole algorithm (MLFMA) [5, 6], and the adaptive integral method (AIM) [7]

  • The computational platform used in the examples is a hybrid CPU/GPU cluster with 2 compute nodes connected by 1000 Mbps network cards

Summary

Introduction

The method of moments (MoM) can provide highly accurate results for a wide variety of complex electromagnetic (EM) problems [1, 2]. However, it requires a large amount of memory and computation time when large dense matrix equations are solved with a direct solver based on lower/upper (LU) decomposition. One way to reduce this cost is to use fast algorithms based on MoM, for example, the fast multipole method (FMM) [3, 4], the multilevel fast multipole algorithm (MLFMA) [5, 6], and the adaptive integral method (AIM) [7]. Generally speaking, these approaches reduce the memory requirement and the computational complexity compared with MoM and give more accurate results than the hybridization of MoM with high-frequency methods [8, 9]. This work presents an efficient GPU-based out-of-core parallel LU solver for the higher-order MoM (HoMoM) that uses the message passing interface (MPI) and runs on a high-performance cluster with multiple CPU/GPU computing nodes to solve complex EM problems. The proposed solver has two distinguishing characteristics: (1) an efficient two-level out-of-core scheme overcomes the limits of RAM and GPU memory; (2) an overlapping scheme based on asynchronous data transfer and CUDA streams hides the communication time between the CPUs and the GPU cards.
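
The two-level idea can be illustrated with a small, self-contained sketch. It is not the solver from the paper: it assumes no pivoting, uses a Python dict in place of disk files, and uses a plain row-tile loop where a real implementation would launch GPU kernels. All function names (`split_into_slabs`, `lu_out_of_core`, `demo_matrix`, `max_residual`) and the parameters `width` and `tile_rows` are hypothetical. Level 1 brings one column slab at a time from "disk" into "RAM"; level 2 applies the trailing update in row tiles small enough for a limited device memory.

```python
"""Toy two-level out-of-core LU without pivoting (illustrative only)."""


def demo_matrix(n):
    """Complex, diagonally dominant matrix, safe to factor unpivoted."""
    return [[complex(10, 1) if i == j
             else complex(1 / (1 + abs(i - j)), 0.1 * (i - j))
             for j in range(n)] for i in range(n)]


def split_into_slabs(a, width):
    """Store the matrix as column slabs; the dict stands in for disk files."""
    return {j: [row[c0:c0 + width] for row in a]
            for j, c0 in enumerate(range(0, len(a), width))}


def lu_out_of_core(disk, n, width, tile_rows):
    """Left-looking LU over column slabs; overwrites disk with packed L and U."""
    for j in sorted(disk):
        slab = [row[:] for row in disk[j]]       # level 1: disk -> RAM
        cols, c0 = len(slab[0]), j * width
        for k in range(j):                       # apply each factored slab
            left = disk[k]
            k0, k1 = k * width, k * width + len(left[0])
            # Forward substitution with the unit-lower diagonal block of slab k.
            for p in range(k0, k1):
                for i in range(p + 1, k1):
                    l_ip = left[i][p - k0]
                    for c in range(cols):
                        slab[i][c] -= l_ip * slab[p][c]
            # Level 2: trailing update done tile by tile; each row tile is
            # independent, so a real code could ship it to the GPU.
            for r0 in range(k1, n, tile_rows):
                for i in range(r0, min(r0 + tile_rows, n)):
                    for p in range(k0, k1):
                        l_ip = left[i][p - k0]
                        for c in range(cols):
                            slab[i][c] -= l_ip * slab[p][c]
        # Factor the updated slab in place (unblocked, no pivoting).
        for p in range(c0, c0 + cols):
            pivot = slab[p][p - c0]
            for i in range(p + 1, n):
                slab[i][p - c0] /= pivot
                l_ip = slab[i][p - c0]
                for c in range(p - c0 + 1, cols):
                    slab[i][c] -= l_ip * slab[p][c]
        disk[j] = slab                           # RAM -> disk
    return disk


def max_residual(a, disk):
    """Largest entry of |L*U - A|, with L and U read from the packed slabs."""
    n = len(a)
    lu = [[v for j in sorted(disk) for v in disk[j][i]] for i in range(n)]
    worst = 0.0
    for i in range(n):
        for c in range(n):
            s = sum((1.0 if p == i else lu[i][p]) * lu[p][c]
                    for p in range(min(i, c) + 1))
            worst = max(worst, abs(s - a[i][c]))
    return worst


if __name__ == "__main__":
    a = demo_matrix(7)
    disk = lu_out_of_core(split_into_slabs(a, 3), 7, 3, 2)
    print("max |LU - A| =", max_residual(a, disk))
```

The left-looking order is chosen here because each slab is written back only once, which suits slow storage. A production solver would add partial pivoting, distribute slabs across MPI processes, and overlap tile transfers with computation, none of which this sketch attempts.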

Higher-Order Method of Moments
GPU-Based Out-of-Core LU Solver
Numerical Results and Discussion
Conclusion