In this article, a graphics processing unit (GPU)-based acceleration of a hybrid discontinuous Galerkin time domain (HDGTD) method, which couples Maxwell's equations with the Helmholtz vector wave equation, is developed. The computational domain is discretized into tetrahedra, and the resulting elements are partitioned into two regions solved by Maxwell's equations and the Helmholtz vector wave equation, respectively. Hierarchical vector basis functions are used to expand the unknowns in the HDGTD method, and a universal matrix technique is proposed that decomposes the geometry-dependent matrices of each tetrahedron into a sum of universal matrices defined in barycentric coordinates, greatly reducing memory usage. A local time stepping (LTS) scheme based on simple interpolation is introduced into the HDGTD method to solve multiscale problems efficiently. Two compute unified device architecture (CUDA)-based mapping techniques, namely 1-D and 2-D block mappings, are implemented to trade off parallel speedup against memory usage. The 1-D block mapping achieves a speedup of more than 590 times, while the 2-D block mapping yields more than 150 times acceleration together with a 13-fold memory reduction. Several practical, complex examples demonstrate the good performance of the proposed parallel method.
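To make the block-mapping terminology concrete, the following minimal CUDA sketch illustrates the general idea of a 1-D block mapping: one thread block per tetrahedral element, one thread per local basis-function unknown. All names (updateElements1D, NUM_BASIS, coeffs, rhs) and the placeholder update are illustrative assumptions, not the paper's HDGTD scheme.

// Hypothetical sketch of a 1-D block mapping for an element-wise update.
// One thread block per tetrahedron; one thread per local unknown.
#include <cstdio>
#include <cuda_runtime.h>

constexpr int NUM_BASIS = 20;  // assumed number of hierarchical vector basis functions per tetrahedron

__global__ void updateElements1D(float* coeffs, const float* rhs, float dt, int numElems)
{
    int elem = blockIdx.x;   // one block per tetrahedral element
    int dof  = threadIdx.x;  // one thread per local basis-function coefficient
    if (elem < numElems && dof < NUM_BASIS) {
        int idx = elem * NUM_BASIS + dof;
        // Placeholder explicit time-step update: c^{n+1} = c^n + dt * rhs
        coeffs[idx] += dt * rhs[idx];
    }
}

int main()
{
    const int numElems = 1024;
    const int n = numElems * NUM_BASIS;
    float *coeffs, *rhs;
    cudaMalloc(&coeffs, n * sizeof(float));
    cudaMalloc(&rhs, n * sizeof(float));
    cudaMemset(coeffs, 0, n * sizeof(float));
    cudaMemset(rhs, 0, n * sizeof(float));

    // 1-D mapping: grid dimension = number of elements, block dimension = local unknowns
    updateElements1D<<<numElems, NUM_BASIS>>>(coeffs, rhs, 0.01f, numElems);
    cudaDeviceSynchronize();

    cudaFree(coeffs);
    cudaFree(rhs);
    return 0;
}

A 2-D block mapping, by contrast, would distribute additional work (e.g., multiple unknowns or operations per thread) across a second block dimension, which is where the speedup-versus-memory tradeoff described above arises.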