Abstract

In this article, we present work to port the CASTEP first-principles materials modeling program to accelerators using OpenACC. We discuss the challenges and opportunities presented by graphics processing unit (GPU) architectures in particular, and the approach taken in the CASTEP OpenACC port. Whilst the port is still under active development, early performance results show that significant speed-ups may be gained, particularly for materials simulations using so-called “nonlocal functionals,” where speed-ups can exceed a factor of ten.

Highlights

  • First-principles materials modelling is an invaluable tool for scientists to investigate the chemical, physical and electronic properties of matter, especially in the solid state [2].

  • The construction and application of Vnlxc requires two fast Fourier transforms (FFTs) per pair of bands and k-points; i.e. the total computational cost scales as $N_b^2 N_k^2 N \log N$. This is a substantial increase in the computational workload compared to semilocal exchange-correlation, and usually means that the computational time is dominated by the FFTs even for relatively large simulations.

  • Benchmark simulations of the new code showed that the refactoring had improved the GPU performance substantially, further reducing the iteration time by over 40% to 45 seconds; the overall speed-up was over 14x compared to the baseline CPU-only simulations.
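
The FFT scaling quoted in the highlights can be made concrete with a small estimate. The sketch below is illustrative only: the function names are hypothetical, it counts every ordered pair of (band, k-point) states, and it ignores symmetry reductions and constant prefactors, so it shows the trend rather than CASTEP's actual operation count.

```python
import math


def nonlocal_fft_count(n_bands, n_kpoints, ffts_per_pair=2):
    """Number of FFTs if every (band, k-point) state is paired with every other.

    Assumption: ordered pairs are counted and no symmetry is exploited.
    """
    n_states = n_bands * n_kpoints
    return ffts_per_pair * n_states * n_states


def nonlocal_fft_cost(n_bands, n_kpoints, n_grid):
    """Leading-order cost estimate in arbitrary units: FFT count times N log N."""
    return nonlocal_fft_count(n_bands, n_kpoints) * n_grid * math.log(n_grid)


if __name__ == "__main__":
    base = nonlocal_fft_cost(n_bands=10, n_kpoints=2, n_grid=4096)
    doubled = nonlocal_fft_cost(n_bands=20, n_kpoints=2, n_grid=4096)
    print("FFTs for 10 bands, 2 k-points:", nonlocal_fft_count(10, 2))
    print("cost ratio after doubling the bands:", doubled / base)
```

Under these assumptions, doubling either the number of bands or the number of k-points quadruples the estimated FFT work, which is why the FFTs come to dominate the run time for nonlocal functionals.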


Summary

Published by the IEEE Computer Society

Direct diagonalisation yields all $N_p$ eigenstates, which is typically two orders of magnitude more than the $\sim N$ required to model the behaviour of the $N$ particles. For these reasons, when using a plane-wave basis set, it is common to use an iterative diagonalisation method, which allows the calculation to be restricted to the $\sim N$ eigenstates of interest. Almost all of these methods proceed by repeated application of a matrix to the set of trial states and, in the present context, do not require the Hamiltonian matrix to be constructed and stored explicitly; they need only the ability to apply it (and related matrices) to trial eigenvectors. The advent of exascale computing, in particular the move towards heterogeneous computing, presents a significant challenge.
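
The matrix-free idea in the paragraph above can be sketched on a toy problem. The example below is a minimal illustration, not CASTEP's algorithm: it assumes a one-dimensional periodic model with a cosine potential, applies the Hamiltonian with FFTs (kinetic term in reciprocal space, potential in real space), and finds the lowest few states with a simple block preconditioned steepest-descent scheme plus a Rayleigh-Ritz step. The names `make_apply_h` and `lowest_states`, the grid size, the potential and the preconditioner are all hypothetical choices made for this sketch.

```python
import numpy as np


def make_apply_h(n_grid=64, box_length=10.0, depth=2.0):
    """Return (apply_h, kinetic) for a toy 1D periodic Hamiltonian."""
    x = np.arange(n_grid) * box_length / n_grid
    g = 2.0 * np.pi * np.fft.fftfreq(n_grid, d=box_length / n_grid)
    kinetic = 0.5 * g**2  # diagonal in reciprocal space
    potential = -depth * np.cos(2.0 * np.pi * x / box_length)  # diagonal in real space

    def apply_h(psi):
        """Apply H to each column of psi (real-space values); no matrix is built."""
        t_psi = np.fft.ifft(kinetic[:, None] * np.fft.fft(psi, axis=0), axis=0)
        return t_psi + potential[:, None] * psi

    return apply_h, kinetic


def lowest_states(apply_h, kinetic, n_states, n_iter=200, seed=0):
    """Lowest n_states eigenpairs using only applications of H to trial states."""
    rng = np.random.default_rng(seed)
    n_grid = kinetic.size
    trial = rng.standard_normal((n_grid, n_states)).astype(complex)
    x, _ = np.linalg.qr(trial)  # orthonormal starting guess
    evals = np.zeros(2 * n_states)
    for _ in range(n_iter):
        hx = apply_h(x)
        theta = np.real(np.sum(x.conj() * hx, axis=0))  # Rayleigh quotients
        resid = hx - x * theta
        # Precondition the residual in reciprocal space (damps high-G components).
        w = np.fft.ifft(np.fft.fft(resid, axis=0) / (kinetic[:, None] + 1.0), axis=0)
        # Rayleigh-Ritz in the subspace spanned by the states and their corrections.
        basis, _ = np.linalg.qr(np.hstack([x, w]))
        h_sub = basis.conj().T @ apply_h(basis)
        h_sub = 0.5 * (h_sub + h_sub.conj().T)  # remove round-off asymmetry
        evals, evecs = np.linalg.eigh(h_sub)
        x = basis @ evecs[:, :n_states]
    return evals[:n_states], x


if __name__ == "__main__":
    apply_h, kinetic = make_apply_h()
    evals, _ = lowest_states(apply_h, kinetic, n_states=4)
    print("lowest eigenvalues (iterative):", evals)
```

Each iteration touches the Hamiltonian only through `apply_h`, so the cost per iteration is set by the FFTs and a few small dense operations on the trial states. For a grid this small the result can be cross-checked by applying `apply_h` to an identity matrix and diagonalising the resulting dense matrix directly, which is exactly the step that becomes impractical at production sizes.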

Towards exascale computing with accelerators
The CASTEP GPU port
Challenges and further work
Findings
Conclusion