PBBFMM3D: A parallel black-box algorithm for kernel matrix-vector multiplication

Ruoxi Wang, Chao Chen, Jonghyun Lee, Eric Darve

doi:10.1016/j.jpdc.2021.04.005

Abstract

The kernel matrix-vector product is ubiquitous in science and engineering applications. However, a naive method requires O(N^2) operations, which becomes prohibitive for large-scale problems. To reduce the computational cost, we introduce a parallel method that provably requires O(N) operations and delivers an approximate result within a prescribed tolerance. The distinctive feature of our method is that it requires only the ability to evaluate the kernel function, offering a black-box interface to users. Our parallel approach targets multi-core shared-memory machines and is implemented using OpenMP. Numerical results demonstrate up to 19× speedup on 32 cores. We also present a real-world application in geostatistics, where our parallel method was used to deliver fast principal component analysis of covariance matrices.
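
For reference, the naive method mentioned in the abstract amounts to direct summation over all pairs of points. The following is a minimal Python sketch of that baseline only, not of the PBBFMM3D algorithm or its interface; the function names and the choice of a 1/r kernel are assumptions made for illustration. It shows what "black-box" means in practice: the routine needs nothing beyond a callable that evaluates the kernel.

```python
import math


def laplace_kernel(x, y):
    """Illustrative kernel K(x, y) = 1/|x - y| (an assumed example kernel).

    Returns 0 when the two points coincide, to skip the singular self-term.
    """
    r = math.dist(x, y)
    return 0.0 if r == 0.0 else 1.0 / r


def direct_kernel_matvec(kernel, targets, sources, charges):
    """Compute phi_i = sum_j K(x_i, y_j) * q_j by direct summation.

    With N targets and N sources this performs N^2 kernel evaluations,
    which is the cost a fast method aims to avoid.
    """
    return [
        sum(kernel(x, y) * q for y, q in zip(sources, charges))
        for x in targets
    ]


# Three points in 3D with one weight ("charge") per point.
points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0)]
charges = [1.0, 2.0, 4.0]
phi = direct_kernel_matvec(laplace_kernel, points, points, charges)
```

Because the kernel is passed in as an ordinary function, swapping in a different kernel requires no change to the summation routine; a fast method with a black-box interface keeps that property while reducing the operation count.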

Full Text