Abstract

Plane wave pseudopotential (PWP) density functional theory (DFT) calculations are the most widely used materials science simulations, and PWP DFT codes are arguably the most important materials science codes. We have implemented a PWP DFT code, PEtot, on a multi-node GPU machine. Building on previous work, we have further improved the speed of the code and achieved speedups of 13x to 22x over CPU calculations for a typical 512-atom system. These speedups are much higher than those reported in similar work for this important class of materials simulation codes on GPU clusters. The improvement is obtained by (1) moving the calculation fully onto the GPU; (2) adopting a new algorithm to reduce the amount of data in MPI communication; and (3) using new GPU and CPU numerical libraries. We also provide a detailed quantitative analysis of the computational times for different physical systems and numbers of GPU units, which helps to clarify the challenges and bottlenecks of PWP DFT simulations on GPU machines. Based on this analysis, we list the machine and library requirements needed to further improve the performance of PWP DFT calculations.

