Distributed Memory Computer Cluster Research Articles

The parallelisation of implicit time domain solvers is essential for applications requiring large, highly over-sampled meshes that exceed the memory limits of standalone workstations (1). Prime examples include large arrays of small antennas (2) and on-chip interconnects for state-of-the-art integrated circuits (ICs) (3). The need for highly over-sampled meshes in these applications can be illustrated by example. Current on-chip IC interconnect widths can be 22 nm or less, yet the interconnects cover an area $> 100\,\mathrm{mm}^2$ and carry signals with fundamental frequencies in the range DC to 10 GHz. Typically, transient effects involving high frequencies are of most interest, but even these can require meshes with $\Delta x < \lambda_0/10^6$, i.e., four to five orders of magnitude smaller than for typical FDTD meshes (neglecting considerations relating to material properties). It is not practical to use explicit FDTD solvers for such meshes, even in parallel, because the Courant-Friedrichs-Lewy (CFL) stability criterion enforces a commensurate reduction in the time step size. While implicit FDTD solvers, such as ADI-FDTD, offer freedom from the CFL stability criterion, it has been presumed that parallel implementations on all but the most specialised architectures (4) would be of little benefit due to the high communication overhead. However, we have been able to show that this is not the case (5). In this paper we present an overview of our work to date on the parallelisation of implicit time domain methods, including parallel ADI-FDTD (1,5) and parallel ADI-BOR-FDTD (6), on both symmetric multiprocessor (SMP) machines and distributed memory computer clusters (DMCC). We do not parallelise the tri-diagonal matrix solver itself, because each 2D matrix must have more than 4,000 (40,000) elements per direction before this becomes efficient for SMP (DMCC) machines. This requires 3D domains that are at or beyond current memory limits for state-of-the-art machines. Instead, we employ domain decomposition and solve multiple, smaller, tri-diagonal matrix systems in parallel. Carefully organised data exchanges avoid unnecessary double-handling of data during communication, improving the performance of the parallel algorithm. We will describe our domain decomposition scheme and show results for small and large domains, comparing parallel FDTD with parallel ADI-FDTD, and parallel BOR-FDTD with parallel ADI-BOR-FDTD. We demonstrate efficient solutions of large, full 3D meshes with 8 billion mesh cells for parallel ADI-FDTD. Since machines with SMP and DMCC architectures are widely available, our demonstration of parallel speed-up represents an important step forward for the application of implicit time domain solvers to large, highly over-sampled meshes with $\Delta x < 10^{-2}\lambda$. We expect that our parallelisation approach can be adopted for related implicit FDTD methods.
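The idea of solving many small, independent tri-diagonal systems side by side can be sketched in a few lines. The following is a minimal illustration only, not the authors' implementation: it assumes each grid line yields one tri-diagonal system, uses the standard Thomas algorithm for each, and hands the lines to a thread pool. The names `thomas_solve`, `solve_lines`, and `tridiag_matvec` are hypothetical, and the sketch omits the domain decomposition and data exchanges described in the abstract.

```python
from concurrent.futures import ThreadPoolExecutor


def thomas_solve(lower, diag, upper, rhs):
    """Solve one tri-diagonal system with the Thomas algorithm.

    lower[i] multiplies x[i-1] (lower[0] is unused), diag[i] multiplies x[i],
    and upper[i] multiplies x[i+1] (upper[-1] is unused).
    """
    n = len(diag)
    c_prime = [0.0] * n
    d_prime = [0.0] * n
    c_prime[0] = upper[0] / diag[0]
    d_prime[0] = rhs[0] / diag[0]
    # Forward elimination of the sub-diagonal.
    for i in range(1, n):
        denom = diag[i] - lower[i] * c_prime[i - 1]
        c_prime[i] = upper[i] / denom if i < n - 1 else 0.0
        d_prime[i] = (rhs[i] - lower[i] * d_prime[i - 1]) / denom
    # Back substitution.
    x = [0.0] * n
    x[-1] = d_prime[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d_prime[i] - c_prime[i] * x[i + 1]
    return x


def solve_lines(lines, workers=4):
    """Solve independent tri-diagonal systems, one per (assumed) grid line.

    CPython threads only illustrate the independence of the systems here;
    they are not expected to give a real speed-up for pure-Python code.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda system: thomas_solve(*system), lines))


def tridiag_matvec(lower, diag, upper, x):
    """Multiply a tri-diagonal matrix by x (used only to build test data)."""
    n = len(x)
    return [
        (lower[i] * x[i - 1] if i > 0 else 0.0)
        + diag[i] * x[i]
        + (upper[i] * x[i + 1] if i < n - 1 else 0.0)
        for i in range(n)
    ]


if __name__ == "__main__":
    n = 8
    lower, diag, upper = [-1.0] * n, [4.0] * n, [-1.0] * n
    x_true = [float(i) for i in range(n)]
    rhs = tridiag_matvec(lower, diag, upper, x_true)
    solutions = solve_lines([(lower, diag, upper, rhs)] * 3)
    print(max(abs(a - b) for a, b in zip(solutions[0], x_true)))
```

Because no system depends on another, the list of lines can be split among workers in any order, which is the property a decomposition of this kind relies on.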

An efficient marching-on-in-time (MOT) scheme is presented for solving electric, magnetic, and combined field integral equations pertinent to the analysis of transient electromagnetic scattering from perfectly conducting surfaces residing in an unbounded homogeneous medium. The proposed scheme extends the frequency-domain adaptive integral/pre-corrected fast Fourier transform (FFT) method to the time domain. Fields on the scatterer that are produced by space-time sources residing on its surface are computed: 1) by locally projecting, for each time step, all sources onto a uniform auxiliary grid that encases the scatterer; 2) by computing everywhere on this grid the transient fields produced by the resulting auxiliary sources via global, multilevel/blocked, space-time FFTs; and 3) by locally interpolating these fields back onto the scatterer surface. Because this procedure is inaccurate when source and observer points reside close to each other, 4) near fields are computed classically, albeit (pre-)corrected for errors introduced through the use of global FFTs. The proposed scheme has a computational complexity and memory requirement of $O(N_t N_s \log^2 N_s)$ and $O(N_s^{3/2})$, respectively, when applied to quasiplanar structures, and of $O(N_t N_s^{3/2} \log^2 N_s)$ and $O(N_s^2)$ when used to analyze scattering from general surfaces. Here, $N_s$ and $N_t$ denote the numbers of spatial and temporal degrees of freedom of the surface current density. These computational cost and memory requirements are contrasted with those of classical MOT solvers, which scale as $O(N_t N_s^2)$ and $O(N_s^2)$, respectively. A parallel implementation of the scheme on a distributed-memory computer cluster that uses the message-passing interface is described. Simulation results demonstrate the accuracy, efficiency, and parallel performance of the implementation.
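The saving in step 2 comes from the fact that, on a uniform grid, the field produced by the auxiliary sources is a convolution with a translation-invariant kernel, which an FFT evaluates far faster than a direct double sum. The sketch below shows only that core idea under strong simplifying assumptions: a 1D grid, a static real kernel, and no projection, interpolation, or near-field correction. The function names and the kernel `1 / (1 + |d|)` are hypothetical stand-ins, not the Green's function or data layout of the scheme described above.

```python
import cmath


def fft(values, inverse=False):
    """Recursive radix-2 FFT (unnormalised); len(values) must be a power of two."""
    n = len(values)
    if n == 1:
        return list(values)
    even = fft(values[0::2], inverse)
    odd = fft(values[1::2], inverse)
    sign = 1.0 if inverse else -1.0
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def grid_fields_direct(sources, kernel):
    """O(N^2) reference: field[m] = sum_j kernel(m - j) * sources[j]."""
    n = len(sources)
    return [sum(kernel(m - j) * sources[j] for j in range(n)) for m in range(n)]


def grid_fields_fft(sources, kernel):
    """Same sum in O(N log N) using a zero-padded FFT convolution."""
    n = len(sources)
    size = 1
    while size < 2 * n:  # pad so the circular convolution does not wrap around
        size *= 2
    # Kernel samples for offsets -(n-1)..(n-1), stored with wrap-around indexing.
    g = [0.0] * size
    for offset in range(-(n - 1), n):
        g[offset % size] = kernel(offset)
    s = list(sources) + [0.0] * (size - n)
    product = [a * b for a, b in zip(fft(g), fft(s))]
    result = fft(product, inverse=True)
    # The inverse transform above is unnormalised, so divide by the length.
    return [result[m].real / size for m in range(n)]


def example_kernel(offset):
    """Hypothetical smooth kernel that depends only on the grid offset."""
    return 1.0 / (1.0 + abs(offset))


if __name__ == "__main__":
    sources = [float((3 * i) % 5) for i in range(10)]
    direct = grid_fields_direct(sources, example_kernel)
    fast = grid_fields_fft(sources, example_kernel)
    print(max(abs(a - b) for a, b in zip(direct, fast)))
```

The two routines agree to rounding error; only the operation count differs, which is the source of the reduced scaling quoted in the abstract once the same idea is applied in space and time.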

Articles published on Distributed Memory Computer Cluster

Challenges and opportunities for the simulation of calcium waves on modern multi-core and many-core parallel computing platforms.

Scalable Hierarchical Parallel Algorithm for the Solution of Super Large-Scale Sparse Linear Equations

OpenCL‐based acceleration of the FDTD method in computational electromagnetics

Scattering and absorption properties of polydisperse wavelength-sized particles covered with much smaller grains

Direct pore-level modeling of incompressible fluid flow in porous media

Parallel implementation of the ADI‐FDTD method

Parallelisation of Implicit Time Domain Methods: Progress with ADI-FDTD

Parallel implementation of a semidefinite programming solver based on CSDP on a distributed memory cluster

Time Domain Adaptive Integral Method for Surface Integral Equations
