Abstract

The wFMM program, a Wideband Fast Multipole Method for the two-dimensional complex Helmholtz equation, is updated. The new version uses significantly less memory than the original version, and its memory usage is almost constant across wavenumbers k for a given number of particles. The CPU time is also slightly improved. Additionally, memory leaks and errors caused by external variables when the program is used inside an iterative solver are fixed. The new version of wFMM and other useful codes are available from the website http://fastmultipole.org/.

Manuscript Title: Revision of wFMM – A Wideband Fast Multipole Method for the two-dimensional complex Helmholtz equation
Authors: Min Hyung Cho and Wei Cai
Program Title: 2D-WFMM
Journal Reference:
Catalogue identifier: AEHI_v2_0
Program summary URL: http://cpc.cs.qub.ac.uk/summaries/AEHI_v2_0.html
Program obtainable from: CPC Program Library, Queenʼs University, Belfast, N. Ireland
Licensing provisions: Standard CPC licence, http://cpc.cs.qub.ac.uk/licence/licence.html
No. of lines in distributed program, including test data, etc.: 4669
No. of bytes in distributed program, including test data, etc.: 46 849
Programming language: C
Computer: Any
Operating system: Any operating system with gcc compiler.
For multi-threaded computing, gcc version 4.4 or newer is recommended.
RAM: Depends on the number of particles N and the wavenumber k
Number of processors used: Multi-core processors with shared memory
Keywords: Wideband Fast Multipole Method, Helmholtz equation, Fast solver
Classification: 4.8, 4.12
External routines/libraries: OpenMP (http://openmp.org/wp/)
Subprograms used: None
Catalogue identifier of previous version: AEHI_v1_0
Journal reference of previous version: Computer Physics Communications 181 (12) (2010) 2086
Does the new version supersede the previous version?: Yes
Nature of problem: Evaluate the interaction between N particles governed by the fundamental solution of the 2D complex Helmholtz equation for a wide range of wavenumbers k.
Solution method: Multilevel Fast Multipole Algorithm in a hierarchical quad-tree data structure with a cutoff level, which combines a low-frequency method and a high-frequency method.
Reasons for the new version: Improve the efficiency of the program, including memory usage, repeated use in an iterative solver or other programs, and a minor speed-up.
Summary of revisions: First, the tree traversal in the downward pass is modified to accommodate higher levels of the tree structure and to save memory. The original version used a queue data structure to visit the tree from the top level to the bottom level. In the new version, the queue is replaced with a simple recursive algorithm. As a result, the code uses less memory, and high-level tree refinement becomes more efficient. Secondly, the memory leaks and external variables that caused problems under repeated use in an iterative solver are fixed. The original version had no problems as standalone software. However, when it was called repeatedly from other codes, it did not release its memory after each call. This caused significant problems for large matrix systems. In the new version, all memory allocations are tracked and freed as soon as they become unnecessary.
Consequently, the new version uses almost constant memory for all wavenumbers k when the number of particles N is fixed, and memory no longer accumulates across calls in an iterative solver. Also, in the original version, several external variables were not initialized when the code was called multiple times. This resulted in incorrect numerical solutions and, in the worst cases, a memory allocation error. In the new version, external variables are converted to internal ones. The code was also tested by calling it multiple times in other solvers (results will be published soon). As a minor improvement, some of the complex number operations are simplified, and the running time is slightly reduced. Finally, a simple makefile is added to the package for easy compilation. Running "make" compiles one of the test cases described in the readme.rtf file.
The new version of wFMM is compared with the original version in Tables 1 and 2, using the same parameters as in the original paper. Numerical tests were conducted on a machine with two quad-core Intel Xeon 3.00 GHz processors, 32 GB of memory, and gcc version 4.5.1, running Fedora release 11. All the results in the tables can be obtained by modifying the testrun.c file. Both tables show significant memory savings and a minor CPU time reduction for both real and complex k with N = 490000 particles.
Running time: The CPU time depends on the number of particles N and their distribution, the wavenumber k, and the number of cores in the machine. The computation time increases as N log N.
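
Illustrative sketch: The two changes described under Summary of revisions (recursion instead of a queue in the downward pass, and internal instead of external variables) can be illustrated with a minimal sketch in C. This is not the wFMM source. The names Box, PassContext, box_create, box_free, and downward_pass are hypothetical, and the per-box work is reduced to a counter. In general, a breadth-first queue may need to hold up to a whole level of boxes at once, whereas depth-first recursion keeps only one root-to-leaf path on the call stack. Keeping all state in a caller-owned structure means that repeated calls start from a known state.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical quad-tree box; NOT the data structure used in wFMM. */
typedef struct Box {
    struct Box *child[4];   /* NULL where a child box does not exist */
    int level;              /* 0 at the root */
} Box;

/* Caller-owned state (hypothetical). It takes the place of file-scope
 * "external" variables, so every call starts from values the caller set. */
typedef struct {
    long boxes_visited;
    int  max_level_seen;
} PassContext;

/* Free a subtree in post-order; safe to call with NULL. */
void box_free(Box *b)
{
    if (b == NULL)
        return;
    for (int i = 0; i < 4; ++i)
        box_free(b->child[i]);
    free(b);
}

/* Build a full quad-tree down to max_level.
 * Returns NULL, with nothing left allocated, if any allocation fails. */
Box *box_create(int level, int max_level)
{
    Box *b = malloc(sizeof *b);
    if (b == NULL)
        return NULL;
    b->level = level;
    for (int i = 0; i < 4; ++i)
        b->child[i] = NULL;
    if (level < max_level) {
        for (int i = 0; i < 4; ++i) {
            b->child[i] = box_create(level + 1, max_level);
            if (b->child[i] == NULL) {
                box_free(b);    /* release the partially built subtree */
                return NULL;
            }
        }
    }
    return b;
}

/* Top-down (parent before children) traversal by plain recursion.
 * No queue is allocated; the extra memory is one stack frame per level. */
void downward_pass(const Box *b, PassContext *ctx)
{
    assert(ctx != NULL);
    if (b == NULL)
        return;
    /* Placeholder for the real per-box work of a downward pass. */
    ctx->boxes_visited++;
    if (b->level > ctx->max_level_seen)
        ctx->max_level_seen = b->level;
    for (int i = 0; i < 4; ++i)
        downward_pass(b->child[i], ctx);
}
```

A caller would build the tree once with box_create, pass a freshly zeroed PassContext to each downward_pass call (for example, once per iteration of an outer solver), and release everything with box_free when the tree is no longer needed.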
