Abstract

General-purpose graphical processing units (GPUs) offer high processing speeds for certain classes of highly parallelizable computations, such as matrix operations and Fourier transforms, that lie at the heart of first-principles electronic structure calculations. Inclusion of exact-exchange increases the cost of density functional theory by orders of magnitude, motivating the use of GPUs. Porting the widely used electronic density functional code VASP to run on a GPU results in a 5- to 20-fold performance boost for exact-exchange calculations compared with a traditional CPU. We analyze performance bottlenecks and discuss classes of problems that will benefit from the GPU. As an illustration of the capabilities of this implementation, we calculate the lattice stability of the α- and β-rhombohedral boron structures utilizing exact-exchange. Our results confirm the energetic preference for symmetry-breaking partial occupation of the β-rhombohedral structure at low temperatures, but do not resolve the stability of α relative to β.

Highlights

  • First principles quantum mechanical calculations of total energy are among the most pervasive and demanding applications of supercomputers

  • Our study focuses on four specific structures of interest: the 12-atom α-rhombohedral structure with Pearson type hR12; a 96-atom supercell of α, doubled along each axis in order to match the lattice parameters of β, that we denote hR12x8; the ideal 105-atom β-rhombohedral structure with Pearson type hR105; and the symmetry-broken 107-atom β′ variant, with Pearson type aP107, that optimizes the generalized gradient approximation (GGA) total energy

  • The GPU system behaves like a ‘fat’ node, in that its compute power is greatly increased while the inter-process communication stays fixed. As a result, the GPU compares favorably to many CPUs on small structures, as seen in the second row of Table V, labeled hR105 k=2, where 2 CPU cores plus 2 GPUs match the fastest time on the supercomputer, which occurs for 64 cores


Summary

INTRODUCTION

First principles quantum mechanical calculations of total energy are among the most pervasive and demanding applications of supercomputers. For example, elemental boron, with its highly complex crystal structure containing approximately 107 atoms, lies well beyond the limits of exact energy calculation. Replacing the many-body problem with N_e coupled single-electron problems reduces the exponential dependence of the cost on the number of electrons to a polynomial, at the cost of introducing approximations. Both Hartree-Fock (HF) and electronic density functional theory (DFT) are formally O(N_e^4) in complexity [2]. Massively parallel processing became available for low-cost computer systems through the introduction of general-purpose graphical processing units (GPUs). These systems can contain hundreds of cores, with a low cost and low power consumption per core. Our key results are: 1) Exact-exchange calculations confirm that the symmetric β structure is energetically unfavorable relative to α, but its energy can be reduced through symmetry-breaking partial site occupancy. 2) The GPU system outperforms the CPU, with speedups reaching a factor of 20 in computationally demanding cases.
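To give a rough feel for what a formal fourth-power scaling implies for the four boron cells named above, the following minimal sketch compares their relative costs. It is illustrative only: it assumes cost grows as the fourth power of the electron count with identical prefactors, and it uses the fact that for a single element the electron count is proportional to the atom count. The function name `relative_cost` is hypothetical, and real timings also depend on k-point sampling, basis size, and parallel efficiency.

```python
# Illustrative sketch: relative cost under a formal fourth-power scaling.
# Atom counts are taken from the text; everything else is an assumption.
# For a single element, N_e is proportional to the number of atoms, so the
# electrons-per-atom factor cancels in any cost ratio.

STRUCTURES = {"hR12": 12, "hR12x8": 96, "hR105": 105, "aP107": 107}


def relative_cost(n_atoms, n_ref_atoms, exponent=4):
    """Cost of an n_atoms cell relative to an n_ref_atoms cell (hypothetical helper)."""
    return (n_atoms / n_ref_atoms) ** exponent


if __name__ == "__main__":
    reference = STRUCTURES["hR12"]
    for name, n_atoms in STRUCTURES.items():
        ratio = relative_cost(n_atoms, reference)
        print(f"{name}: about {ratio:.0f} times the hR12 cost")
```

Under these assumptions the 96-atom supercell is 8^4 = 4096 times as expensive as the 12-atom cell, which helps explain why exact-exchange calculations on cells of roughly 100 atoms motivate GPU acceleration.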

ELEMENTAL BORON
Structural stability
Setup of DFT calculations
PORTING OF VASP
CPU Performance Analysis
Our Implementation
Performance Results
Findings
CONCLUSIONS