We present a Graphics Processing Unit (GPU)-accelerated version of the real-space SPARC electronic structure code for performing Kohn-Sham density functional theory calculations within the local density and generalized gradient approximations. In particular, we develop a modular math-kernel-based implementation for NVIDIA architectures wherein the computationally expensive operations are carried out on the GPUs, with the remainder of the workload retained on the central processing units (CPUs). Using representative bulk and slab examples, we show that relative to CPU-only execution, GPUs enable speedups of up to 6× and 60× in node and core hours, respectively, bringing time to solution down to less than 30 s for a metallic system with over 14 000 electrons and enabling significant reductions in the computational resources required for a given wall time.