Abstract

Compared with MPI, OpenMP provides an easy way to parallelize the multilevel fast multipole algorithm (MLFMA) on shared-memory systems. However, OpenMP parallelization has many pitfalls because, owing to the algorithm's complicated structure, different parts of the MLFMA have distinct numerical characteristics. These pitfalls often cause very low efficiency, especially when many threads are employed. Through an in-depth investigation of these pitfalls with analysis and numerical experiments, we propose an efficient OpenMP parallel MLFMA. Two strategies are proposed in the parallelization: 1) loop reorganization for the far-field interaction in the MLFMA; 2) determination of a transition level. Numerical experiments on large-scale targets show that the proposed OpenMP parallel scheme performs as efficiently as its MPI counterpart, and much more efficiently than the straightforward OpenMP parallelization.
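To illustrate the kind of loop reorganization the abstract refers to, the sketch below shows one common way such a reorganization can be expressed in OpenMP for the far-field (translation) stage at a single MLFMA tree level. It is a minimal, hypothetical example: the names (`numBoxes`, `numDirections`, `translator`, etc.), the data layout, and the use of `collapse(2)` are assumptions for illustration, not the paper's actual implementation.

```cpp
// Hedged sketch: reorganizing the far-field translation loops at one MLFMA level.
// All names and the data layout are illustrative assumptions.
#include <complex>
#include <vector>
#include <omp.h>

using cplx = std::complex<double>;

// Straightforward version: parallelism only over boxes. At coarse levels the
// number of boxes can fall below the number of threads, leaving threads idle.
void translate_naive(int numBoxes, int numDirections,
                     const std::vector<std::vector<cplx>>& outgoing,
                     const std::vector<cplx>& translator,
                     std::vector<std::vector<cplx>>& incoming)
{
    #pragma omp parallel for
    for (int b = 0; b < numBoxes; ++b)
        for (int k = 0; k < numDirections; ++k)
            incoming[b][k] += translator[k] * outgoing[b][k];
}

// Reorganized version: the box and direction loops are collapsed into a single
// iteration space, so all threads stay busy even when numBoxes is small.
void translate_collapsed(int numBoxes, int numDirections,
                         const std::vector<std::vector<cplx>>& outgoing,
                         const std::vector<cplx>& translator,
                         std::vector<std::vector<cplx>>& incoming)
{
    #pragma omp parallel for collapse(2)
    for (int b = 0; b < numBoxes; ++b)
        for (int k = 0; k < numDirections; ++k)
            incoming[b][k] += translator[k] * outgoing[b][k];
}
```

In this spirit, a transition level would mark the tree level at which one switches between the two parallelization patterns, since fine levels have many boxes and few directions while coarse levels have the opposite; the paper's actual criterion for choosing that level is given in the full text.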
