Abstract

General-purpose computing on graphics processing units (GPGPU), with programming models such as NVIDIA's Compute Unified Device Architecture (CUDA), offers the capability to accelerate computational electromagnetics analysis. However, due to the communication-intensive nature of the finite-element algorithm, neither the assembly phase nor the solution phase can be implemented on fine-grained many-core GPU processors in a straightforward manner. In this paper, we identify the bottlenecks in the GPU parallelization of the finite-element method for electromagnetic analysis, and propose potential solutions to alleviate them. We first discuss efficient parallelization strategies for the finite-element matrix assembly on a single GPU and on multiple GPUs. We then explore parallelization strategies for the finite-element matrix solution, in conjunction with parallelizable preconditioners that reduce the total solution time. We show that, with proper parallelization and implementation, GPUs achieve significant speedups over OpenMP-enabled multi-core CPUs.
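The core difficulty the abstract alludes to in the assembly phase can be illustrated with a minimal CUDA sketch (all names here are hypothetical, and this is not the paper's actual implementation): when one thread is assigned per element, neighboring elements share mesh nodes, so their contributions to the same global matrix entries collide. One simple way to resolve the conflict is an atomic update.

```cuda
// Minimal sketch (hypothetical names): one-thread-per-element FEM assembly.
// Elements sharing mesh nodes write to the same global entries, so the
// scattered updates must be serialized; atomicAdd handles only the
// conflicting writes. (atomicAdd on double requires compute capability 6.0+.)
__global__ void assembleGlobal(const int    *elemDofs,   // [numElems * dofsPerElem]
                               const double *elemMats,   // [numElems * dofsPerElem^2]
                               double       *globalVals, // dense global matrix, row-major
                               int numElems, int dofsPerElem, int numDofs)
{
    int e = blockIdx.x * blockDim.x + threadIdx.x;
    if (e >= numElems) return;

    // Element matrix Ke for element e, stored contiguously.
    const double *Ke = elemMats + (size_t)e * dofsPerElem * dofsPerElem;
    for (int i = 0; i < dofsPerElem; ++i) {
        int gi = elemDofs[e * dofsPerElem + i];   // global row index
        for (int j = 0; j < dofsPerElem; ++j) {
            int gj = elemDofs[e * dofsPerElem + j]; // global column index
            // Colliding contributions from adjacent elements are resolved atomically.
            atomicAdd(&globalVals[(size_t)gi * numDofs + gj],
                      Ke[i * dofsPerElem + j]);
        }
    }
}
```

Atomics are only one option; mesh-coloring schemes, which launch one kernel per color so that no two concurrently processed elements share a node, avoid atomic contention entirely at the cost of extra kernel launches, and sparse (e.g. CSR) storage would replace the dense indexing used above in any realistic code.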
