Abstract

This chapter shows how the electronic structure can be propagated without diagonalization in the Liouville–von Neumann molecular dynamics (LvNMD) approach. It also demonstrates how to integrate GPU code written in C/CUDA with a main Fortran program. The physical model for the direct molecular dynamics described here contains nuclei and electrons. In direct molecular dynamics, the positions of the nuclei change with time and the electronic structure adjusts according to the time-dependent Schrödinger equation (TDSE). The time dependence of the electronic structure is often neglected, and the time-independent Schrödinger equation is solved instead. This leads to the so-called Born–Oppenheimer approximation. The conventional algorithm, which is based on diagonalization and maps poorly to GPU architectures, can be replaced by an alternative approach based on matrix–matrix multiplication, which maps well to GPU architectures. The current implementation combines Fortran with C and CUDA, using nonthunking CUBLAS for the matrix multiplications and direct CUDA kernels. The main program and the initialization of the dynamics are written in Fortran. At each molecular dynamics (MD) time step, the Fock matrix is evaluated on the CPU and transferred to GPU memory. More generally, it is important to analyze the theoretical scaling of the algorithm and of its major components, as well as the cost of computing the various building blocks for the target problem size. The focus must be on the most expensive building block of the algorithm. Existing efficient libraries should be used where possible, since this is generally cheaper and more effective than optimizing code by hand. Only if no efficient GPU library exists for the most expensive building blocks should the effort go into optimizing hand-written CUDA code.
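To make the idea of diagonalization-free propagation concrete, the following is a minimal sketch in plain Python and is not the chapter's actual implementation. It assumes an orthonormal basis, atomic units, and a Fock matrix F that is held fixed during one step, so that the density matrix P obeys dP/dt = -i[F, P]. The step P(t+dt) = exp(-iF dt) P exp(+iF dt) is evaluated with a truncated nested-commutator series, which is one common choice and may differ from the expansion used in the chapter. The names `matmul`, `commutator`, and `propagate` are hypothetical helpers introduced only for this illustration.

```python
import math


def matmul(a, b):
    """Dense matrix product of two lists-of-lists (real or complex)."""
    inner, cols = len(b), len(b[0])
    return [[sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
            for row in a]


def commutator(a, b):
    """Return [a, b] = a*b - b*a for square matrices."""
    ab, ba = matmul(a, b), matmul(b, a)
    n = len(a)
    return [[ab[i][j] - ba[i][j] for j in range(n)] for i in range(n)]


def propagate(fock, density, dt, order=12):
    """One time step without diagonalization.

    Sums P + sum_k (-i*dt)^k / k! * [F, [F, ... [F, P]]] up to `order`,
    which is the series for exp(-i*F*dt) P exp(+i*F*dt).
    Assumes F is constant over the step (an assumption of this sketch).
    """
    n = len(density)
    result = [[complex(x) for x in row] for row in density]
    term = [row[:] for row in result]
    for k in range(1, order + 1):
        comm = commutator(fock, term)
        # term_k = (-i*dt / k) * [F, term_{k-1}]
        term = [[(-1j * dt / k) * comm[i][j] for j in range(n)] for i in range(n)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result


# Toy two-level system: one occupied orbital, off-diagonal coupling in F.
fock = [[0.0, 1.0], [1.0, 0.0]]
density = [[1.0, 0.0], [0.0, 0.0]]
dt = 0.1

stepped = propagate(fock, density, dt)
trace = sum(stepped[i][i] for i in range(2)).real
print("trace:", trace)
print("P00:", stepped[0][0].real, "analytic cos^2(dt):", math.cos(dt) ** 2)
```

Every operation in `propagate` is a matrix product, a scaling, or an addition. In a GPU setting the products are the part that would be delegated to a library GEMM routine, which is why this style of propagation maps well to GPU architectures while a diagonalization-based step does not.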
