Extended computational kernels in a massively parallel implementation of the Trotter–Suzuki approximation

Peter Wittek, Luca Calderaro

doi:10.1016/j.cpc.2015.07.017

Abstract

We extended a parallel and distributed implementation of the Trotter–Suzuki algorithm for simulating quantum systems to study a wider range of physical problems and to make the library easier to use. The new release allows periodic boundary conditions, many-body simulations of non-interacting particles, arbitrary stationary potential functions, and imaginary time evolution to approximate the ground state energy. The new release is more resilient to the computational environment: a wider range of compiler chains and more platforms are supported. To ease development, we provide a more extensive command-line interface, an application programming interface, and wrappers from high-level languages.

New version program summary

Program title: Trotter–Suzuki-MPI
Catalogue identifier: AEXL_v1_0
Program summary URL: http://cpc.cs.qub.ac.uk/summaries/AEXL_v1_0.html
Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland
Licensing provisions: GNU General Public License, version 3
No. of lines in distributed program, including test data, etc.: 14083
No. of bytes in distributed program, including test data, etc.: 110023
Distribution format: tar.gz
Programming language: C++, CUDA, Python, MATLAB.
Computer: x86-64.
Operating system: Linux.
Has the code been vectorized or parallelized?: Yes. Number of processors used: 1–64 in a single node, more in a cluster.
RAM: 5 MBytes to 512 GBytes
Journal reference of associated manuscript: Comput. Phys. Comm. 184 (2013) 1165
Classification: 4.12.
External routines: OpenMP, MPI, CUDA
Does the new version supersede the previous version?: Yes. The original version is not held in the CPC Program Library but can be obtained from https://github.com/peterwittek/trotter-suzuki-mpi

Nature of problem: The evolution of a general quantum system is described by the time-dependent Schrödinger equation. The solution of this equation involves calculating a matrix exponential, which is formally simple, but computer implementations must consider several factors to achieve both high performance and high accuracy.

Solution method: The Trotter–Suzuki approximation leads to an efficient algorithm for solving the time-dependent Schrödinger equation [1, 2]. The implementation uses high-performance parallel kernels in a distributed environment to maximize the computational power of this algorithm [3, 4].

Reasons for new version: The computational kernels were generalized to address a much wider range of physics problems. Furthermore, the code has been modularized to make development easier, providing both a command-line and an application programming interface. High-level wrappers from Python and MATLAB provide further ease of use.

Summary of revisions:
1. The implementation was generalized to include a richer variety of physics problems. The problem can have periodic boundary conditions. Many-body simulations of non-interacting particles are now possible. An arbitrary stationary potential function can be defined. The convenience function expect_values helps to obtain expectation values.
2. Imaginary time evolution was implemented to find the ground state before starting the simulation. To avoid the overhead of conditional branching in the most computationally intense parts of the code, some of the core kernel functions were duplicated to include the imaginary time evolution.
3. Most of the functionality is exposed through a command-line interface (CLI) for convenience. This allows specifying the files of the initial state and the potential, the parameters of the Hamiltonian, and further parameters related to the simulation, such as the computational kernel to use and the frequency at which snapshots should be written to the disk.
4. The full functionality of the implementation is exposed as an application programming interface (API) through the 'trotter' function. This allows for integrating the simulation in a larger MPI programme, and it is also useful for initializing the state and the potential without having files on the disk. To demonstrate the use of the API, several examples are provided with the code.
5. To further ease development, we redesigned the structure of the implementation, making it more modular. We also introduced a unit testing framework to avoid regression.
6. We improved the testing of MPI dependencies by the configure script and allowed compilation without MPI. We also improved the treatment of Intel and Visual C++ compilers. We developed wrappers for Python and MATLAB for the CPU kernel for a high-level interface with the library.

Restrictions: The vectorized CPU kernel must have a tile width that is divisible by two. This puts a constraint on the possible matrix sizes for this kernel. For instance, when running twelve MPI processes in a 4×3 configuration, the dimensions must be divisible by six and eight.

Unusual features: The library currently only supports the CPU kernel under Windows. The Python and MATLAB wrappers support the CPU and SSE kernels.

Additional comments: The high-performance kernels were independently extended to study spin dynamics [5]. It remains for future work to include lattice models in this implementation.

Running time: The generalization slightly altered the memory access patterns of the computational kernels, yielding a performance penalty of approximately 20% compared to the previous version (Table 1). The scaling properties did not change, and we see near-optimal scaling when increasing the number of nodes. The actual running time depends on the system size, the duration to be simulated, and the computational resources. It can range from a few seconds to several days.
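
As an illustration of the solution method and of the imaginary time evolution mentioned in the summary of revisions, the following is a minimal single-process sketch in plain Python. It is not the library's kernel code and does not use its 'trotter' API. It assumes a one-dimensional ring of sites with nearest-neighbour hopping, with hbar = 1 and a constant energy offset dropped; the names bond_step, trotter_step, norm, energy and the parameter hop are hypothetical and chosen only for this example. Each second-order Trotter–Suzuki step applies exact 2x2 exponentials on even and odd bonds, wrapping the last site to the first to obtain periodic boundary conditions. The real or imaginary time variant is selected once per step rather than inside the innermost loop.

```python
import cmath
import math


def bond_step(psi, start, c, s):
    """Apply the 2x2 block [[c, s], [s, c]] to the pairs (j, j+1), j = start, start+2, ...

    The neighbour index wraps around, which gives periodic boundary conditions.
    """
    n = len(psi)
    for j in range(start, n, 2):
        k = (j + 1) % n
        a, b = psi[j], psi[k]
        psi[j] = c * a + s * b
        psi[k] = s * a + c * b


def norm(psi):
    """Euclidean norm of the state vector."""
    return math.sqrt(sum(abs(a) ** 2 for a in psi))


def trotter_step(psi, potential, hop, dt, imaginary=False):
    """One second-order Trotter-Suzuki step, in place, for the assumed model
    H = sum_j V_j |j><j| - hop * sum_j (|j><j+1| + h.c.) on a ring."""
    if len(psi) % 2 or len(psi) < 4:
        raise ValueError("need an even number of sites, at least 4")
    if imaginary:
        # exp(-dt * (-hop * sigma_x)) = cosh(hop dt) I + sinh(hop dt) sigma_x
        half = (math.cosh(hop * dt / 2), math.sinh(hop * dt / 2))
        full = (math.cosh(hop * dt), math.sinh(hop * dt))
        diag = [math.exp(-v * dt / 2) for v in potential]
    else:
        # exp(-i dt * (-hop * sigma_x)) = cos(hop dt) I + i sin(hop dt) sigma_x
        half = (math.cos(hop * dt / 2), 1j * math.sin(hop * dt / 2))
        full = (math.cos(hop * dt), 1j * math.sin(hop * dt))
        diag = [cmath.exp(-1j * v * dt / 2) for v in potential]
    for j, d in enumerate(diag):      # half step of the potential term
        psi[j] *= d
    bond_step(psi, 0, *half)          # even bonds, half step
    bond_step(psi, 1, *full)          # odd bonds, full step
    bond_step(psi, 0, *half)          # even bonds, half step
    for j, d in enumerate(diag):      # second half step of the potential term
        psi[j] *= d
    if imaginary:
        # Imaginary time evolution is not unitary, so renormalize the state.
        scale = norm(psi)
        for j in range(len(psi)):
            psi[j] /= scale


def energy(psi, potential, hop):
    """Expectation value <psi|H|psi> / <psi|psi> for the assumed ring model."""
    n = len(psi)
    total = 0.0
    for j in range(n):
        k = (j + 1) % n
        total += potential[j] * abs(psi[j]) ** 2
        total -= 2.0 * hop * (psi[j].conjugate() * psi[k]).real
    return total / norm(psi) ** 2


if __name__ == "__main__":
    sites = 16
    state = [1.0] + [0.0] * (sites - 1)   # particle localized on one site
    flat = [0.0] * sites                  # zero potential
    for _ in range(1000):
        trotter_step(state, flat, hop=1.0, dt=0.1, imaginary=True)
    print("approximate ground state energy:", energy(state, flat, 1.0))
```

In this sketch, repeated imaginary time steps damp the excited components of the state, so the energy estimate approaches the lowest eigenvalue of the assumed lattice Hamiltonian, while real time steps preserve the norm up to rounding error. A production kernel would operate on two-dimensional tiles distributed over MPI processes, which this example does not attempt to reproduce.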

Journal: Computer Physics Communications
Publication Date: Aug 11, 2015
Citations: 8
