OpenMP Fortran and C programs for solving the time-dependent Gross–Pitaevskii equation in an anisotropic trap

Paulsamy Muruganandam, Dušan Vudragović, Antun Balaž, Luis E. Young-S., Sadhan K. Adhikari

doi:10.1016/j.cpc.2016.03.015

Abstract

We present a new version of the previously published Fortran and C programs for solving the Gross–Pitaevskii equation for a Bose–Einstein condensate with contact interaction in one, two and three spatial dimensions in imaginary and real time, yielding both stationary and non-stationary solutions. To reduce the execution time on multicore processors, the new versions of the programs are parallelized using the Open Multi-Processing (OpenMP) interface. The input in the previous versions of the programs was the mathematical quantity nonlinearity for the dimensionless form of the Gross–Pitaevskii equation, whereas in the present programs the inputs are quantities of experimental interest, such as the number of atoms, the scattering length, and the oscillator length of the trap. New output files for some integrated one- and two-dimensional densities of experimental interest are given. We also present speedup test results for the new programs.

New version program summary
Program title: BEC-GP-OMP package, consisting of: (i) imag1d, (ii) imag2d, (iii) imag3d, (iv) imagaxi, (v) imagcir, (vi) imagsph, (vii) real1d, (viii) real2d, (ix) real3d, (x) realaxi, (xi) realcir, (xii) realsph.
Catalogue identifier: AEDU_v4_0.
Program Summary URL: http://cpc.cs.qub.ac.uk/summaries/AEDU_v4_0.html
Program obtainable from: CPC Program Library, Queen’s University of Belfast, N. Ireland.
Licensing provisions: Apache License 2.0.
No. of lines in distributed program, including test data, etc.: 130308.
No. of bytes in distributed program, including test data, etc.: 929062.
Distribution format: tar.gz.
Programming language: OpenMP C; OpenMP Fortran.
Computer: Any multi-core personal computer or workstation.
Operating system: Linux and Windows.
RAM: 1 GB.
Number of processors used: All available CPU cores on the executing computer.
Classification: 2.9, 4.3, 4.12.
Catalogue identifier of previous version: AEDU_v1_0, AEDU_v2_0.
Journal reference of previous version: Comput. Phys. Commun. 180 (2009) 1888; ibid. 183 (2012) 2021.
Does the new version supersede the previous version?: No. It does supersede versions AEDU_v1_0 and AEDU_v2_0, but not AEDU_v3_0, which is the MPI-parallelized version.
Nature of problem: The present OpenMP Fortran and C programs solve the time-dependent nonlinear partial differential Gross–Pitaevskii (GP) equation for a Bose–Einstein condensate in one (1D), two (2D), and three (3D) spatial dimensions in a harmonic trap with six different symmetries: 1D, axial and radial symmetry in 3D, circular symmetry in 2D, and full anisotropy in 2D and 3D.
Solution method: The time-dependent GP equation is solved by the split-step Crank–Nicolson method by discretizing in space and time. The discretized equation is then solved by propagation, in either imaginary or real time, over small time steps. The method yields the solution of stationary and/or non-stationary problems.
Reasons for the new version: The previously published Fortran and C programs [1,2] for solving the GP equation have recently seen frequent use [3] and have been applied to the more complex scenario of dipolar atoms [4]. They have also been extended to make use of general-purpose graphics processing units (GPGPU) with Nvidia CUDA [5], as well as computer clusters using the Message Passing Interface (MPI) [6]. However, the vast majority of users run single-computer programs, with which the solution of a realistic dynamical 1D problem, not to mention the more complicated 2D and 3D problems, can be time-consuming. Practically all computers now have multicore processors, ranging from 2 up to 18 and more CPU cores. Some computers include motherboards with more than one physical CPU, further increasing the number of available CPU cores on a single computer to several tens. The present programs are parallelized using OpenMP over all the CPU cores and can significantly reduce the execution time.
Furthermore, in the old versions of the programs [1,2] the inputs were based on the mathematical quantity nonlinearity for the dimensionless form of the GP equation. The inputs for the present versions of the programs are given in terms of phenomenological variables of experimental interest, as in Refs. [4,5], i.e., number of atoms, scattering length, harmonic oscillator length of the confining trap, etc. Also, the output files are given names which make identification of their contents easier, as in Refs. [4,5]. In addition, new output files for integrated densities of experimental interest are provided, and all programs were thoroughly revised to eliminate redundancies.
Summary of revisions: The previous Fortran [1] and C [2] programs for the solution of the time-dependent GP equation in 1D, 2D, and 3D with different trap symmetries have been modified to achieve two goals. First, they are parallelized using the OpenMP interface to reduce the execution time on multicore processors. The previous C programs [2] had OpenMP-parallelized versions of the 2D and 3D programs, together with the serial versions, while here all programs are OpenMP-parallelized. Second, the programs now have input and output files with quantities of phenomenological interest. There are six trap symmetries, and in both C and Fortran there are twelve programs, six for imaginary-time propagation and six for real-time propagation, totaling 24 programs. In 3D, we consider full radial symmetry, axial symmetry and full anisotropy. In 2D, we consider circular symmetry and full anisotropy. The structure of all programs is similar.

For the Fortran programs the input data (number of atoms, scattering length, harmonic oscillator trap length, trap anisotropy, etc.) are conveniently placed at the beginning of each program. For the C programs the input data are placed in separate input files, examples of which can be found in a directory named input.
Examples of output files for both the Fortran and C programs are placed in the corresponding directories called output. From the physical inputs, the programs calculate the dimensionless nonlinearities actually used in the calculation. The provided programs use physical input parameters that give the same nonlinearity values as the previously published programs [1,2], so that the output files of the old and new programs can be directly compared. The output files are conveniently named so that their contents can be easily identified, following Refs. [4,5]. For example, the file named <code>-out.txt, where <code> is the name of the individual program, is the general output file containing input data, time and space steps, nonlinearity, energy and chemical potential, and was named fort.7 in the old Fortran version. The file <code>-den.txt is the output file with the condensate density, which had the names fort.3 and fort.4 in the old Fortran version for imaginary- and real-time propagation, respectively. Other density outputs, such as the initial density, are commented out to give a simpler set of output files. Users can re-enable them by removing the comment symbols, if needed.
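As a rough illustration of how physical inputs can map onto a dimensionless nonlinearity, the sketch below uses the standard 3D contact-interaction relation g = 4πNa/l, where N is the number of atoms, a the scattering length and l the harmonic oscillator length. The function name and argument names are hypothetical, and the conventions actually used in the distributed programs (in particular for the reduced 1D and 2D geometries) should be checked against their source.

```c
/* Dimensionless 3D nonlinearity g = 4*pi*N*a/l (assumed convention).
 * natoms: number of atoms
 * a_s:    scattering length
 * l_ho:   harmonic oscillator length, in the same unit as a_s
 * All names here are illustrative and not taken from the package. */
double nonlinearity_3d(double natoms, double a_s, double l_ho)
{
    const double pi = 3.14159265358979323846;
    return 4.0 * pi * natoms * a_s / l_ho;
}
```

For instance, 2000 atoms with a scattering length of 100 Bohr radii (100 × 5.2917721e-11 m) in a trap with an oscillator length of 1 µm give a nonlinearity of about 133.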
Table 1. Wall-clock execution times (in seconds) for runs with 1, 6 and 20 CPU cores with different programs using the Intel Fortran ifort (F-1, F-6 and F-20, respectively) and Intel C icc (C-1, C-6 and C-20, respectively) compilers on a workstation with two Intel Xeon E5-2650 v3 CPUs, with a total of 20 CPU cores, and the obtained speedups (speedup-F = F-1/F-20, speedup-C = C-1/C-20) for 20 CPU cores.

Program    F-1     F-6    F-20   speedup-F   C-1     C-6    C-20   speedup-C
imag1d     32      26     26     1.2         45      28     27     1.7
imagcir    15      15     15     1.0         21      15     15     1.4
imagsph    12      12     12     1.0         19      12     10     1.9
real1d     194     84     72     2.7         304     110    98     3.1
realcir    132     62     57     2.3         182     78     64     2.8
realsph    119     68     67     1.8         191     76     61     3.1
imag2d     190     66     52     3.7         394     77     33     11.9
imagaxi    240     74     56     4.3         499     113    55     9.1
real2d     269     70     47     5.7         483     96     35     13.8
realaxi    132     37     25     5.3         237     51     22     10.8
imag3d     1682    472    366    4.6         2490    545    202    12.3
real3d     15,479  3494   2082   7.4         22,228  4558   1438   15.5

[Figure omitted]

Also, some new output files are introduced in this version of the programs. The files <code>-rms.txt are the output files with values of root-mean-square (rms) sizes in the multi-variable cases. There are new files with integrated densities, such as imag2d-den1d_x.txt, where the first part (imag2d) denotes that the density was calculated with the 2D program imag2d, and the second part (den1d_x) stands for the 1D density in the x-direction, obtained by integrating the 2D density $|\phi(x,y)|^2$ in the x–y plane over the y-coordinate,

$$ n_{1D}(x) = \int_{-\infty}^{\infty} dy\, |\phi(x,y)|^2. \tag{1} $$

Similarly, imag3d-den1d_x.txt and real3d-den1d_x.txt represent 1D densities from a 3D calculation, obtained by integrating the 3D density $|\phi(x,y,z)|^2$ over the y- and z-coordinates. The files imag3d-den2d_xy.txt and real3d-den2d_xy.txt are the integrated 2D densities in the x–y plane from a 3D calculation, obtained by integrating the 3D density over the z-coordinate, and similarly for the other output files.
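The integration in Eq. (1) can be sketched on a discrete grid as follows. This is a minimal illustration, not the package's routine: it assumes the 2D density is stored row-major on a uniform grid and uses the trapezoidal rule, whereas the distributed programs may use a different quadrature and data layout. The function and argument names are hypothetical. The loop over x is independent for each grid point, so it is a natural candidate for an OpenMP parallel loop; the pragma is guarded so the code also builds without OpenMP.

```c
/* Integrate a 2D density |phi(x,y)|^2 over y to obtain n1D(x), Eq. (1).
 * den2d: density stored row-major, den2d[i*ny + j] = |phi(x_i, y_j)|^2
 * den1d: output array of length nx
 * nx, ny: grid sizes (ny >= 2); dy: uniform grid spacing in y
 * Trapezoidal rule is an assumption made for this sketch. */
void integrate_over_y(const double *den2d, double *den1d,
                      long nx, long ny, double dy)
{
    long i;
#ifdef _OPENMP
#pragma omp parallel for
#endif
    for (i = 0; i < nx; i++) {
        const double *row = den2d + i * ny;
        /* End points carry weight 1/2, interior points weight 1. */
        double sum = 0.5 * (row[0] + row[ny - 1]);
        for (long j = 1; j < ny - 1; j++)
            sum += row[j];
        den1d[i] = sum * dy;
    }
}
```

The same pattern, applied twice, would reduce a 3D density to a 1D one.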
Again, the calculation and saving of these integrated densities are commented out in the programs and can be activated by the user, if needed.

In the real-time propagation programs, additional results for the dynamics are saved in files such as real2d-dyna.txt, where the first column denotes time and the second, third and fourth columns give the rms sizes for the x-, y-, and r-coordinates, respectively. The dynamics is generated by multiplying the nonlinearity by a pre-defined factor during the NRUN iterations, starting from the wave function calculated during the NPAS iterations. Such files were named fort.8 in the old Fortran versions of the programs. There are similar files in the 3D real-time programs as well.

Often one needs a precise stationary-state solution from imaginary-time propagation and then uses it to study dynamics with real-time propagation. For that purpose, if the integer NSTP is set to zero in real-time propagation, the density obtained in the imaginary-time simulation is used as the initial wave function for real-time propagation, as in Refs. [4,5]. In addition, at the end of the output files <code>-out.txt, we have introduced two new outputs: the wall-clock execution time and the CPU time for each run.

We tested our programs on a workstation with two 10-core Intel Xeon E5-2650 v3 CPUs, and present results for all programs compiled with the Intel compiler. In Table 1 we show the wall-clock execution times for runs on 1, 6 and 20 CPU cores for Fortran and C. The corresponding columns “speedup-F” and “speedup-C” give the ratio of the wall-clock execution times of runs on 1 and 20 CPU cores, and denote the actual measured speedup for 20 CPU cores. For the programs with effectively one spatial variable, the Fortran programs turn out to be quicker for a small number of cores, whereas for a larger number of CPU cores and for the programs with three spatial variables the C programs are faster.
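The speedup columns of Table 1 are simple ratios of wall-clock times, which the following small sketch reproduces. The parallel efficiency (speedup divided by the number of cores) is a standard derived metric added here for illustration; it is not reported in the text, and the function names are hypothetical.

```c
/* Speedup of an n-core run relative to a single-core run,
 * e.g. speedup-F = F-1/F-20 in Table 1. */
double speedup(double t_single, double t_parallel)
{
    return t_single / t_parallel;
}

/* Parallel efficiency: measured speedup divided by the number of cores.
 * Not reported in the text; included only as an illustrative metric. */
double efficiency(double t_single, double t_parallel, int ncores)
{
    return speedup(t_single, t_parallel) / (double)ncores;
}
```

With the imag3d entries of Table 1, speedup(2490.0, 202.0) gives about 12.3 for C and speedup(1682.0, 366.0) about 4.6 for Fortran, in agreement with the tabulated values; the corresponding efficiencies on 20 cores are roughly 0.62 and 0.23.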
We also studied the speedup of the programs as a function of the number of available CPU cores. The performance of the imag3d Fortran and C programs is illustrated in Fig. 1(a) and (b), where we plot the speedup and the actual wall-clock time of the imag3d C and Fortran programs as a function of the number of CPU cores on a workstation with two Intel Xeon E5-2650 v3 CPUs, with a total of 20 CPU cores. The plot in Fig. 1(a) shows that the C program parallelizes more efficiently than the Fortran program. However, as the wall-clock time in Fortran for a single CPU core is less than that in C, the wall-clock times in both cases are comparable, see Fig. 1(b). A saturation of the speedup with increasing number of CPU cores is expected in all cases. However, the saturation is reached sooner in the Fortran than in the C programs, and therefore the C programs can be recommended for a larger number of CPU cores. For a small number of CPU cores the Fortran programs should be preferable. For example, from Table 1 we see that for 6 CPU cores the Fortran programs are at least as fast as the C programs in all cases. In Fig. 1(a) the saturation of the speedup of the Fortran program is reached at approximately 10 CPU cores, where the wall-clock time of the C program crosses that of the Fortran program.
Additional comments: This package consists of 24 programs, see Program title above.
For the particular purpose of each program, please see the descriptions below.
Running time: Example inputs provided with the programs take less than 30 min on a workstation with two Intel Xeon Processors E5-2650 v3, 2 QPI links, 10 CPU cores (25 MB cache, 2.3 GHz).

Program summary (i), (v), (vi), (vii), (xi), (xii)
Program title: imag1d, imagcir, imagsph, real1d, realcir, realsph.
Title of electronic files in C: (imag1d.c and imag1d.h), (imagcir.c and imagcir.h), (imagsph.c and imagsph.h), (real1d.c and real1d.h), (realcir.c and realcir.h), (realsph.c and realsph.h).
Title of electronic files in Fortran 90: imag1d.f90, imagcir.f90, imagsph.f90, real1d.f90, realcir.f90, realsph.f90.
Maximum RAM memory: 1 GB for the supplied programs.
Programming language used: OpenMP C and Fortran 90.
Typical running time: Minutes on a modern four-core PC.
Nature of physical problem: These programs are designed to solve the time-dependent nonlinear partial differential GP equation in one spatial variable.
Method of solution: The time-dependent GP equation is solved by the split-step Crank–Nicolson method by discretizing in space and time. The discretized equation is then solved by propagation, in imaginary time (imag* programs) or real time (real* programs), over small time steps. The method yields the solution of stationary and/or non-stationary problems.

Program summary (ii), (iv), (viii), (x)
Program title: imag2d, imagaxi, real2d, realaxi.
Title of electronic files in C: (imag2d.c and imag2d.h), (imagaxi.c and imagaxi.h), (real2d.c and real2d.h), (realaxi.c and realaxi.h).
Title of electronic files in Fortran 90: imag2d.f90, imagaxi.f90, real2d.f90, realaxi.f90.
Maximum RAM memory: 1 GB for the supplied programs.
Programming language used: OpenMP C and Fortran 90.
Typical running time: An hour on a modern four-core PC.
Nature of physical problem: These programs are designed to solve the time-dependent nonlinear partial differential GP equation in two spatial variables.
Method of solution: The time-dependent GP equation is solved by the split-step Crank–Nicolson method by discretizing in space and time. The discretized equation is then solved by propagation, in imaginary time (imag* programs) or real time (real* programs), over small time steps. The method yields the solution of stationary and/or non-stationary problems.

Program summary (iii), (ix)
Program title: imag3d, real3d.
Title of electronic files in C: (imag3d.c and imag3d.h), (real3d.c and real3d.h).
Title of electronic files in Fortran 90: imag3d.f90, real3d.f90.
Maximum RAM memory: 1 GB for the supplied programs.
Programming language used: OpenMP C and Fortran 90.
Typical running time: A few hours on a modern four-core PC.
Nature of physical problem: These programs are designed to solve the time-dependent nonlinear partial differential GP equation in three spatial variables.
Method of solution: The time-dependent GP equation is solved by the split-step Crank–Nicolson method by discretizing in space and time. The discretized equation is then solved by propagation, in imaginary time (imag* programs) or real time (real* programs), over small time steps. The method yields the solution of stationary and/or non-stationary problems.
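To make the Crank–Nicolson part of the method more concrete, the sketch below performs one imaginary-time update of the kinetic term only, in 1D, assuming the dimensionless form ∂φ/∂τ = (1/2) ∂²φ/∂x² with φ = 0 at both ends of the grid. The resulting tridiagonal system is solved with the Thomas algorithm. The function name, argument names and the choice of units are assumptions made for illustration and are not taken from the package. In a full split-step scheme this update would be combined with a separate sub-step for the trap and nonlinear terms and, in imaginary time, a renormalization of the wave function; those parts are omitted here.

```c
#include <stdlib.h>

/* One Crank-Nicolson step of size dtau for
 *   d(phi)/d(tau) = (1/2) d^2(phi)/dx^2
 * on a uniform grid of n >= 3 points with spacing dx, with phi held at
 * zero at both end points. phi is updated in place.
 * Returns 0 on success, -1 on invalid input or allocation failure. */
int cn_kinetic_step(double *phi, int n, double dx, double dtau)
{
    if (n < 3 || dx <= 0.0)
        return -1;

    double *d = malloc((size_t)n * sizeof *d);  /* right-hand side      */
    double *c = malloc((size_t)n * sizeof *c);  /* modified upper diag. */
    if (d == NULL || c == NULL) {
        free(d);
        free(c);
        return -1;
    }

    const double a = dtau / (4.0 * dx * dx);    /* off-diagonal weight  */
    const double b = 1.0 + 2.0 * a;             /* implicit diagonal    */
    int i;

    phi[0] = 0.0;
    phi[n - 1] = 0.0;

    /* Explicit half: d = (1 + (dtau/4) D2) phi at the interior points,
     * where D2 is the second difference divided by dx^2. */
    for (i = 1; i < n - 1; i++)
        d[i] = a * phi[i - 1] + (1.0 - 2.0 * a) * phi[i] + a * phi[i + 1];

    /* Implicit half: solve (1 - (dtau/4) D2) phi_new = d.
     * Thomas algorithm, forward elimination. */
    c[1] = -a / b;
    d[1] = d[1] / b;
    for (i = 2; i < n - 1; i++) {
        double m = b + a * c[i - 1];
        c[i] = -a / m;
        d[i] = (d[i] + a * d[i - 1]) / m;
    }

    /* Back substitution. */
    phi[n - 2] = d[n - 2];
    for (i = n - 3; i >= 1; i--)
        phi[i] = d[i] - c[i] * phi[i + 1];

    free(d);
    free(c);
    return 0;
}
```

A real-time version would follow the same structure with complex arithmetic. In higher dimensions a scheme of this kind is typically applied along one coordinate at a time, which leaves many independent tridiagonal solves that can be distributed over CPU cores.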

Full Text