Abstract

There has been recent interest in the deployment of ab initio density matrix renormalization group (DMRG) computations on high-performance computing platforms. Here, we introduce a reformulation of the conventional distributed memory ab initio DMRG algorithm that connects it to the conceptually simpler and advantageous sum of sub-Hamiltonians approach. Starting from this framework, we further explore a hierarchy of parallelism strategies that includes (i) parallelism over the sum of sub-Hamiltonians, (ii) parallelism over sites, (iii) parallelism over normal and complementary operators, (iv) parallelism over symmetry sectors, and (v) parallelism within dense matrix multiplications. We describe how to reduce processor load imbalance and the communication cost of the algorithm to achieve higher efficiencies. We illustrate the performance of our new open-source implementation on a recent benchmark ground-state calculation of benzene in an orbital space of 108 orbitals and 30 electrons, with a bond dimension of up to 6000, and on a model of the FeMo cofactor with 76 orbitals and 113 electrons. The observed parallel scaling from 448 to 2800 central processing unit cores is nearly ideal.

Highlights

  • The Density Matrix Renormalization Group (DMRG) algorithm[1,2] is established as a method to obtain highly accurate low-energy eigenstates of ab initio quantum chemistry Hamiltonians[3,4,5,6,7,8,9,10,11,12,13,14,15].

  • To the best of our knowledge, there has not been an implementation that utilizes all 5 sources of parallelism in a scalable DMRG code for ab initio problems. This may be partly ascribed to the fact that strategies (iv) and (v) are most conveniently implemented in a DMRG code[39,40] that is structured using a Matrix Product Operator (MPO)/Matrix Product State (MPS) formalism,[33,41] while many other efficient ab initio DMRG implementations[42,43] using strategies (i), (ii) and (iii) are organized around the construction and transformation of renormalized operators.[6]

  • We introduced a modification of the conventional strategy for distributed memory parallelism in ab initio DMRG algorithms that reduces the computation to the manipulation of independent sub-Hamiltonians, together with a small wavefunction communication step.
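
The sum of sub-Hamiltonians idea in the last highlight can be illustrated with a toy sketch. It is not the paper's implementation: it uses small dense random matrices in place of MPO/MPS tensors, a thread pool as a stand-in for distributed-memory ranks, and the helper names `split_hamiltonian` and `apply_distributed` are hypothetical. It shows only that if the Hamiltonian is split as H = H_1 + ... + H_P, each worker can apply its own H_p to the wavefunction independently, and a single sum over the partial results (the small communication step) recovers H|psi>.

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor


def split_hamiltonian(terms, n_groups):
    """Deal operator terms round-robin into n_groups sub-Hamiltonians (toy model)."""
    dim = terms[0].shape[0]
    subs = [np.zeros((dim, dim)) for _ in range(n_groups)]
    for k, term in enumerate(terms):
        subs[k % n_groups] += term
    return subs


def apply_distributed(subs, psi):
    """Each worker applies only its own sub-Hamiltonian; partial results are then summed.

    The final sum plays the role of the reduction over ranks in a real
    distributed-memory code (assumption: one sub-Hamiltonian per worker).
    """
    with ThreadPoolExecutor(max_workers=len(subs)) as pool:
        partials = list(pool.map(lambda h: h @ psi, subs))
    return np.sum(partials, axis=0)


# Build a toy "Hamiltonian" as a sum of random symmetric terms (hypothetical data).
rng = np.random.default_rng(0)
dim, n_terms = 64, 12
terms = []
for _ in range(n_terms):
    a = rng.standard_normal((dim, dim))
    terms.append((a + a.T) / 2)

psi = rng.standard_normal(dim)
psi /= np.linalg.norm(psi)

subs = split_hamiltonian(terms, n_groups=4)
hpsi_parallel = apply_distributed(subs, psi)

# Reference: apply the full Hamiltonian in one piece.
hpsi_serial = sum(terms) @ psi
max_diff = float(np.max(np.abs(hpsi_parallel - hpsi_serial)))
```

In this sketch `max_diff` should be at the level of floating-point round-off, since the two results differ only in the order of summation. The round-robin split is chosen here purely for simplicity; how terms are actually assigned to processors, and how load imbalance is reduced, is a design question the paper addresses and this toy does not.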


Summary

INTRODUCTION

The Density Matrix Renormalization Group (DMRG) algorithm[1,2] is established as a method to obtain highly accurate low-energy eigenstates of ab initio quantum chemistry Hamiltonians[3,4,5,6,7,8,9,10,11,12,13,14,15]. Brabec et al. reported a non-spin-adapted massively parallel implementation of DMRG for quantum chemistry using strategies (ii) and (iii).[35] We note promising recent progress in GPU-accelerated parallel DMRG algorithms.[36–38] To the best of our knowledge, there has not been an implementation that utilizes all 5 sources of parallelism in a scalable DMRG code for ab initio problems. This may be partly ascribed to the fact that strategies (iv) and (v) are most conveniently implemented in a DMRG code[39,40] that is structured using an MPO/MPS formalism,[33,41] while many other efficient ab initio DMRG implementations[42,43] using strategies (i), (ii) and (iii) are organized around the construction and transformation of renormalized operators.[6]

THEORY
Parallelism over renormalized operators
Parallelism over sub-Hamiltonians
Parallelism over sites
Shared memory parallelism over normal and complementary operators
Shared memory parallelism over symmetry sectors
RESULTS
Parallelism over dense matrix multiplication
Parallel Scaling
Findings
CONCLUSIONS
