Abstract
The fast multipole method (FMM) has gradually become ubiquitous as a means of reducing both the CPU and memory costs of integral equation (IE) solvers, as reflected in the number of papers on the topic. However, effective parallelization of FMM-based IE solvers remains a long-standing problem. While existing techniques have shown reasonable scalability up to 1024 processors, performance drops off precipitously beyond that. This state of the art stands in contrast to the rest of the computing community, where the focus is now on developing parallel solvers for exascale systems. In this paper, we dissect a scalable parallel FMM kernel to examine the scalability of each of its components. At the conference, we shall present a thorough analysis for multicore clusters and propose solutions to the bottlenecks.