This paper discusses two fast implementations of the conjugate gradient iterative method with a hierarchical multilevel preconditioner for solving the complex-valued, sparse systems that arise when the higher-order finite-element method is applied to the time-harmonic Maxwell equations. In the first implementation, denoted PCG-V, a classical V-cycle is applied and the system of equations on the lowest level is solved exactly. The second variant, denoted PCG-V-ASP, solves the system of equations on the lowest level only approximately, using auxiliary space preconditioning (ASP) instead of a direct solution. In this approach, the time needed to solve the sparse system of equations is longer, but the memory requirements are smaller. To accelerate the computations, a graphics processing unit (GPU, Pascal P100) was used for both variants of the multilevel preconditioner. As a result, significant speedups were achieved over the reference parallel implementation on a multicore central processing unit (CPU, Intel Xeon E5-2680 v3, twelve cores). The results indicate that auxiliary space preconditioning does in fact reduce the memory requirements compared with the reference PCG-V method, and at the same time performs each iteration faster. However, if symmetry is taken into account and the memory-efficient supernodal $LDL^{T}$ factorization is employed, the savings are less spectacular than anticipated based on previously published results using LU factorization and the multifrontal technique. PCG-V also requires fewer iterations, so its time to solution is ultimately shorter. The difference is more pronounced if both preconditioners are run on a CPU. The use of a GPU as an accelerator for the computations considerably improves the performance of PCG-V-ASP relative to that of PCG-V.
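To make the structure of a V-cycle-preconditioned conjugate gradient iteration concrete, the following is a minimal sketch under strong simplifying assumptions; it is not the implementation evaluated in the paper. A real, symmetric positive definite 1D Poisson matrix stands in for the complex-valued FEM system, a two-level geometric cycle with damped Jacobi smoothing stands in for the hierarchical multilevel V-cycle, and everything runs in plain Python on a CPU. All function names (`matvec`, `restrict`, `prolong`, `coarse_solve`, `v_cycle`, `pcg`) are hypothetical. The exact solve in `coarse_solve` plays the role of the lowest-level solve in PCG-V; a PCG-V-ASP-style variant would replace it with an approximate solve.

```python
import math

def matvec(x):
    """Apply the 1D Laplacian stencil [-1, 2, -1] (assumed model problem)."""
    n = len(x)
    return [2.0 * x[i]
            - (x[i - 1] if i > 0 else 0.0)
            - (x[i + 1] if i < n - 1 else 0.0)
            for i in range(n)]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def restrict(r):
    """Transpose of linear interpolation: fine residual -> coarse residual."""
    m = (len(r) - 1) // 2
    return [0.5 * r[2 * j] + r[2 * j + 1] + 0.5 * r[2 * j + 2] for j in range(m)]

def prolong(ec):
    """Linear interpolation: coarse correction -> fine correction."""
    m = len(ec)
    e = [0.0] * (2 * m + 1)
    for j in range(m):
        e[2 * j + 1] = ec[j]
    for j in range(m + 1):
        left = ec[j - 1] if j > 0 else 0.0
        right = ec[j] if j < m else 0.0
        e[2 * j] = 0.5 * (left + right)
    return e

def coarse_solve(rc):
    """Exact lowest-level solve (Thomas algorithm) for the Galerkin coarse
    matrix, which for this model problem is tridiag(-0.5, 1, -0.5)."""
    m = len(rc)
    cp, dp = [0.0] * m, [0.0] * m
    cp[0], dp[0] = -0.5, rc[0]
    for i in range(1, m):
        denom = 1.0 + 0.5 * cp[i - 1]
        cp[i] = -0.5 / denom
        dp[i] = (rc[i] + 0.5 * dp[i - 1]) / denom
    x = [0.0] * m
    x[-1] = dp[-1]
    for i in range(m - 2, -1, -1):
        x[i] = dp[i] - cp[i] * x[i + 1]
    return x

def v_cycle(r, omega=2.0 / 3.0):
    """Two-level cycle used as preconditioner: z approximates A^{-1} r."""
    z = [omega * ri / 2.0 for ri in r]                    # pre-smoothing (Jacobi)
    res = [ri - ai for ri, ai in zip(r, matvec(z))]
    corr = prolong(coarse_solve(restrict(res)))           # coarse correction
    z = [zi + ci for zi, ci in zip(z, corr)]
    res = [ri - ai for ri, ai in zip(r, matvec(z))]
    return [zi + omega * si / 2.0 for zi, si in zip(z, res)]  # post-smoothing

def pcg(b, tol=1e-10, max_iter=200):
    """Preconditioned conjugate gradients with v_cycle as preconditioner."""
    x = [0.0] * len(b)
    r = list(b)
    z = v_cycle(r)
    p = list(z)
    rz = dot(r, z)
    bnorm = math.sqrt(dot(b, b))
    for k in range(1, max_iter + 1):
        Ap = matvec(p)
        alpha = rz / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * ai for ri, ai in zip(r, Ap)]
        if math.sqrt(dot(r, r)) <= tol * bnorm:
            return x, k
        z = v_cycle(r)
        rz_new = dot(r, z)
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x, max_iter

# Usage: the fine grid must have an odd number of unknowns (2m + 1).
solution, iterations = pcg([1.0] * 63)
print("PCG iterations:", iterations)
```

The pre- and post-smoothing steps use the same symmetric smoother, which keeps the preconditioner symmetric, as conjugate gradients requires. The trade-off described in the abstract sits entirely in `coarse_solve`: an exact factorization there costs memory, while an approximate solve saves memory at the price of a weaker preconditioner.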