ABSTRACTThe Boys function, a mathematical integral function, plays a pivotal role and is frequently evaluated in ab initio molecular orbital computations. The main contribution of this paper is to accelerate the bulk evaluation of the Boys function through the effective utilization of GPUs. The proposed GPU implementation addresses GPU‐specific programming issues such as warp divergence and coalesced/stride access to global memory, and we employ the optimal numerical evaluation method from four methods based on input values to ensure efficient computation with sufficient accuracy. Moreover, to consider actual computation of molecular integrals, we have implemented and evaluated the proposed method in two scenarios: single evaluation, which computes a single value of the Boys function for a single input, and incremental evaluation, which computes multiple values of the Boys function incrementally. The execution time of the proposed GPU implementation was evaluated for both scenarios using an NVIDIA A100 Tensor Core GPU. As a result, the GPU‐accelerated bulk evaluation has achieved a throughput of computing the values of the Boys function times per second for the single evaluation and times per second for the incremental evaluation, respectively. Our parallelized CPU and GPU implementation is available at https://github.com/sstsuji/Boys‐function‐GPU‐library.
Read full abstract