Abstract

Density matrix purification methods should be employed when dealing with complex chemical systems containing many atoms. Their running times scale linearly with the number of atoms when the sparsity of the density matrix is exploited. Since the efficiency expected from these methods is closely tied to the underlying parallel implementation of the linear algebra operations (e.g., P² = P × P), we propose a parallel matrix-matrix multiplication for the central processing unit (CPU) and the graphics processing unit (GPU), in SVBR (symmetrical variable block row) format, for energy calculations through the SP2 algorithm. The algorithm was inserted into MOPAC's MOZYME method, reusing its original localized molecular orbital (LMO) Fock matrix assembly and its atomic integral calculation. Correctness and performance tests show that the implemented SP2 is accurate and fast: for a water cluster system with 42,312 orbitals, the GPU version running on a single NVIDIA K40 card achieves speedups of up to 40 times over the single-threaded version. The GPU-accelerated SP2 algorithm within the MOZYME LMO framework enables the calculation of semiempirical wavefunctions with stricter SCF criteria for localized charged molecular systems, as well as single-point energies of molecules with more than 100,000 LMOs in less than 1 h.

Graphical abstract

Parallel CPU and GPU purification algorithms for electronic structure calculations were implemented in MOPAC's MOZYME method. Some matrices in these calculations, e.g., the electron density P, are compressed, and the developed linear algebra operations deal with non-zero entries only. We employed the NVIDIA CUDA platform to develop the GPU algorithms, and accelerations of up to 40 times were achieved for larger systems.
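To make the role of the P² = P × P product concrete, the following is a minimal sketch of SP2 purification in its commonly described trace-correcting form. It is not the SVBR, MOZYME, or GPU implementation summarized above: it uses small dense matrices in pure Python, assumes an orthogonal basis and one electron per occupied orbital, and estimates the spectral bounds with Gershgorin circles. The names `matmul`, `trace`, `gershgorin_bounds`, and `sp2_density` are hypothetical and chosen only for illustration.

```python
"""Dense, pure-Python sketch of SP2 density matrix purification (illustrative only)."""


def matmul(a, b):
    """Multiply two dense matrices stored as lists of rows."""
    cols_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]


def trace(a):
    """Sum of the diagonal elements of a square matrix."""
    return sum(a[i][i] for i in range(len(a)))


def gershgorin_bounds(h):
    """Cheap lower and upper bounds on the eigenvalues of a symmetric matrix."""
    n = len(h)
    radii = [sum(abs(h[i][j]) for j in range(n) if j != i) for i in range(n)]
    e_min = min(h[i][i] - radii[i] for i in range(n))
    e_max = max(h[i][i] + radii[i] for i in range(n))
    return e_min, e_max


def sp2_density(h, n_occ, tol=1e-10, max_iter=100):
    """Return an approximate density matrix for the n_occ lowest states of h.

    Assumption: h is symmetric, expressed in an orthogonal basis, and not a
    multiple of the identity (otherwise the spectral width below is zero).
    """
    n = len(h)
    e_min, e_max = gershgorin_bounds(h)
    width = e_max - e_min
    # Map the spectrum into [0, 1], with the lowest eigenvalues closest to 1.
    x = [[((e_max if i == j else 0.0) - h[i][j]) / width for j in range(n)]
         for i in range(n)]
    for _ in range(max_iter):
        x2 = matmul(x, x)  # the dominant cost: one matrix-matrix product per step
        tr_x, tr_x2 = trace(x), trace(x2)
        if abs(tr_x - tr_x2) < tol:  # trace(X - X^2) ~ 0 means X is idempotent
            break
        # Pick the projection whose trace lands closer to the electron count.
        if abs(tr_x2 - n_occ) < abs(2.0 * tr_x - tr_x2 - n_occ):
            x = x2  # X <- X^2 lowers the trace
        else:
            x = [[2.0 * x[i][j] - x2[i][j] for j in range(n)] for i in range(n)]
    return x  # last iterate; may be unconverged if max_iter was reached


# Toy 4-site chain with nearest-neighbour coupling -1 and two occupied orbitals.
h = [[0.0, -1.0, 0.0, 0.0],
     [-1.0, 0.0, -1.0, 0.0],
     [0.0, -1.0, 0.0, -1.0],
     [0.0, 0.0, -1.0, 0.0]]
p = sp2_density(h, n_occ=2)
energy = trace(matmul(p, h))  # band energy, trace(P H)
```

For this toy matrix the band energy should come out close to -√5, the sum of the two lowest eigenvalues. Every iteration costs exactly one matrix-matrix product, which is why the abstract ties the overall efficiency to how well that product is parallelized and how well it exploits sparsity.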

