Efficient computation of Coulomb and exchange integrals for multi-million atom nanostructures

Piotr T Różański, Michał Zieliński

doi:10.1016/j.cpc.2018.12.011

Abstract

Atomistic modeling of nanostructures such as quantum dots or nanowires often involves numbers of atoms reaching, and even well exceeding, one million. Such a large number of atoms poses a serious computational challenge, especially at the many-body stage of the calculation, where numerous Coulomb matrix elements must be computed. Here we present a practical solution to this problem: the calculations are performed in momentum space using the fast Fourier transform, combined with a memory-efficient way of computing the convolution that avoids spurious interactions with quasi-charges from other super-cells. Finally, the calculation of multiple integrals is optimized by reducing the problem to finding a minimum vertex cover of a graph. All these algorithms are implemented and presented here in a self-contained and highly parallelized computer program named Coulombo. With Coulombo, the time needed to compute Coulomb matrix elements scales quasi-linearly with the number of points in the computational box, and the memory demand is significantly reduced. The proposed solution is potentially applicable not only in nanophysics but also to other mesoscopic simulations and large-scale quantum chemistry problems.

Program summary

Program Title: Coulombo
Program Files doi: http://dx.doi.org/10.17632/98bhm5zbrd.1
Licensing provisions: CC BY 4.0
Programming language: C++
Nature of problem: Computing the Coulomb matrix elements (Coulomb and exchange integrals) is a demanding computational task, yet it is a necessary step in a range of quantum mechanical calculations. For example, in the field of nanostructures such as quantum dots and nanowires, this stage of the calculation is, even after numerous approximations, at least an O(N^2) operation (a summation over all pairs of N atoms or grid points in the analyzed system). Moreover, calculating the full Coulomb matrix usually requires the computation of thousands of such elements, which makes the task a formidable computational challenge.
Solution method: We provide a ready-to-use implementation for calculating Coulomb matrix elements for a given set of input wavefunctions. The implementation is based on the approach introduced in [1], which uses a fast Fourier transform to compute the convolution. In this work we significantly improve the method by eliminating the need to extend the computational domain with padding, thus reducing the memory consumption by a factor of 8. The optimal computational plan for each run is prepared by calculating a minimum vertex cover of a graph representing the subset of requested Coulomb matrix elements.
Additional comments including restrictions and unusual features: The implementation is fully parallelized in a distributed-memory model, using MPI and parallel routines from FFTW [2]. The minimum vertex cover is computed by our greedy approximation algorithm, which we found to perform significantly better than the standard textbook heuristic.
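The vertex-cover step mentioned in the program summary can be illustrated with a minimal C++ sketch. It is not the algorithm implemented in Coulombo, which the summary describes only as a greedy approximation; it is a generic max-degree greedy heuristic. The mapping is an assumption made for illustration: each vertex stands for a quantity that is expensive to prepare (for example, the potential of one orbital-pair product) and each edge for one requested matrix element that can be evaluated once either of its two endpoints has been prepared. The names `Edge`, `greedyVertexCover` and `isVertexCover` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical type for this sketch (not taken from Coulombo):
// an edge links the two vertices that one requested element depends on.
using Edge = std::pair<std::size_t, std::size_t>;

// Generic max-degree greedy heuristic: repeatedly pick the vertex that
// touches the most still-uncovered edges. The result is a valid vertex
// cover, but it is not guaranteed to be a minimum one.
std::vector<std::size_t> greedyVertexCover(std::size_t vertexCount,
                                           const std::vector<Edge>& edges) {
    std::vector<bool> covered(edges.size(), false);
    std::vector<std::size_t> cover;
    std::size_t remaining = edges.size();

    while (remaining > 0) {
        // Count the uncovered edges incident to each vertex.
        std::vector<std::size_t> degree(vertexCount, 0);
        for (std::size_t e = 0; e < edges.size(); ++e) {
            if (covered[e]) continue;
            assert(edges[e].first < vertexCount &&
                   edges[e].second < vertexCount);
            ++degree[edges[e].first];
            if (edges[e].second != edges[e].first) ++degree[edges[e].second];
        }

        // Pick the vertex of highest remaining degree (lowest index on ties).
        std::size_t best = 0;
        for (std::size_t v = 1; v < vertexCount; ++v)
            if (degree[v] > degree[best]) best = v;
        cover.push_back(best);

        // Mark every edge touching the chosen vertex as covered.
        for (std::size_t e = 0; e < edges.size(); ++e) {
            if (!covered[e] &&
                (edges[e].first == best || edges[e].second == best)) {
                covered[e] = true;
                --remaining;
            }
        }
    }
    return cover;
}

// Check that every edge has at least one endpoint in the cover.
bool isVertexCover(const std::vector<std::size_t>& cover,
                   std::size_t vertexCount,
                   const std::vector<Edge>& edges) {
    std::vector<bool> inCover(vertexCount, false);
    for (std::size_t v : cover) inCover[v] = true;
    for (const Edge& e : edges)
        if (!inCover[e.first] && !inCover[e.second]) return false;
    return true;
}
```

Under the assumed mapping, a star-shaped request graph (one vertex linked to four others) yields a cover of a single vertex, so one expensive preparation would serve all four requested elements. Finding a true minimum vertex cover is NP-hard in general, which is why a heuristic of this kind is a natural choice.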

Full Text