Abstract

The kernel matrix-vector product is ubiquitous in science and engineering applications. However, the naive method requires O(N^2) operations, which becomes prohibitive for large-scale problems. To reduce the computational cost, we introduce a parallel method that provably requires O(N) operations and delivers an approximate result within a prescribed tolerance. The distinguishing feature of our method is that it requires only the ability to evaluate the kernel function, offering a black-box interface to users. Our parallel approach targets multi-core shared-memory machines and is implemented using OpenMP. Numerical results demonstrate up to 19× speedup on 32 cores. We also present a real-world application in geostatistics, where our parallel method was used to deliver fast principal component analysis of covariance matrices.
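
For orientation, the naive O(N^2) baseline mentioned above can be sketched in a few lines. This is only the direct summation that the paper's method is designed to replace, not the O(N) algorithm itself. The function names, the Gaussian kernel, and the list-based data layout are illustrative assumptions that do not come from the paper. The only part that mirrors the abstract is the black-box interface: the routine needs nothing beyond the ability to evaluate the kernel at a pair of points.

```python
import math
from typing import Callable, List, Sequence

Point = Sequence[float]


def gaussian_kernel(x: Point, y: Point) -> float:
    """Hypothetical example kernel, exp(-||x - y||^2); any callable would do."""
    return math.exp(-sum((a - b) ** 2 for a, b in zip(x, y)))


def naive_kernel_matvec(
    kernel: Callable[[Point, Point], float],
    points: Sequence[Point],
    weights: Sequence[float],
) -> List[float]:
    """Direct evaluation of b_i = sum_j K(x_i, x_j) * w_j.

    Every pair of points is visited once, so the cost is O(N^2)
    kernel evaluations for N points.
    """
    if len(points) != len(weights):
        raise ValueError("points and weights must have the same length")
    return [
        sum(kernel(x, y) * w for y, w in zip(points, weights))
        for x in points
    ]


# Tiny usage example with two 1-D points.
example_points = [(0.0,), (1.0,)]
example_weights = [1.0, 2.0]
example_result = naive_kernel_matvec(gaussian_kernel, example_points, example_weights)
```

Because the kernel is passed in as an ordinary callable, swapping in a different kernel requires no change to the summation routine. A fast method with the same interface would accept the same three inputs and return an approximation of the same vector.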
