Optimization of Parallel FDTD Computations Based on Program Macro Data Flow Graph Transformations

Adam Smyk, Marek Tudruj

doi:10.5772/16807

Open Access

Publication Date: Jun 24, 2011

License type: cc-by-nc-sa

Abstract

This chapter concerns numerical problems solved by parallel regular computations performed on rectangular meshes that span irregular computational areas. Such problems are more difficult to optimize than problems defined on regular areas, since they cannot be solved by a simple geometrical decomposition of the computational area. Usually, a step-by-step algorithm has to be designed to balance parallel computations and communication within and between the executing processors. The Finite Difference Time Domain (FDTD) simulation of electromagnetic wave propagation in an irregular computational area, numerical linear algebra, and VLSI layout design belong to this class of computational problems, which are solved by unstructured computational algorithms (Lin, 1996) with irregular data patterns. Graph partitioning, which is needed to solve such problems, is NP-complete (Garey et al., 1976), so heuristic methods are used. They generally fall into two kinds: direct methods (Khan et al., 1995) and iterative methods (Khan et al., 1995; Kernighan & Lin, 1970; Kirkpatrick et al., 1983; Karypis & Kumar, 1995; Dutt & Deng, 1997). Direct methods are usually based on min-cut optimization (Stone & Bokhari, 1978). Iterative methods are mainly based on extensions of the Kernighan-Lin algorithm (Kernighan & Lin, 1970), later improved by the Fiduccia-Mattheyses (FM) method (Fiduccia & Mattheyses, 1982). There are also many program graph partitioning packages, such as JOSTLE (Walshaw et al., 1995), SCOTCH (Scotch, 2010) and METIS (Metis, 2008). All of them perform efficient graph partitioning, but two problems remain unresolved. First, for very irregular graphs, the partitioning algorithms used in these packages can produce a partition that splits into two or more graph parts placed in disjoint locations of the computational area.
Observed practice shows that there is no justification for creating such disjoint partitions, because in almost all cases they increase the total communication volume during execution in distributed systems. Second, the partitioning methods mentioned above do not take into account the architectural requirements of the target computational system. This matters especially in heterogeneous systems, where proper load balancing allows all computational resources to be exploited efficiently and, at the same time, reduces the total computation time.
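
The effect of a disjoint partition on communication can be illustrated with a toy example. The following is a minimal sketch, not the chapter's method: the helper names (`grid_graph`, `edge_cut`, `fragments_per_part`) and the 1 x 6 strip of mesh cells are assumptions made for illustration. It compares two equally balanced two-way partitions of the same mesh, one contiguous and one in which a part is split into two fragments, using the number of cut edges as a rough proxy for communication volume.

```python
from collections import deque


def grid_graph(cells):
    """Build a 4-neighbour adjacency map over a set of (row, col) mesh cells."""
    cells = set(cells)
    adj = {cell: [] for cell in cells}
    for (r, c) in cells:
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            neighbour = (r + dr, c + dc)
            if neighbour in cells:
                adj[(r, c)].append(neighbour)
    return adj


def edge_cut(adj, part):
    """Count mesh edges whose endpoints are assigned to different parts."""
    cut = 0
    for u, neighbours in adj.items():
        for v in neighbours:
            if part[u] != part[v]:
                cut += 1
    return cut // 2  # every undirected edge was visited from both ends


def fragments_per_part(adj, part):
    """Return {part id: number of connected fragments that part consists of}."""
    seen = set()
    counts = {}
    for start in adj:
        if start in seen:
            continue
        pid = part[start]
        counts[pid] = counts.get(pid, 0) + 1
        seen.add(start)
        queue = deque([start])
        while queue:  # breadth-first search restricted to cells of the same part
            u = queue.popleft()
            for v in adj[u]:
                if v not in seen and part[v] == pid:
                    seen.add(v)
                    queue.append(v)
    return counts


# Hypothetical 1 x 6 strip of mesh cells, three cells per part in both cases.
cells = [(0, c) for c in range(6)]
adj = grid_graph(cells)
contiguous = {(0, c): 0 if c < 3 else 1 for c in range(6)}
split = {(0, c): 0 if c in (0, 1, 5) else 1 for c in range(6)}

for name, part in (("contiguous", contiguous), ("split", split)):
    print(name, "cut =", edge_cut(adj, part),
          "fragments =", fragments_per_part(adj, part))
```

Both partitions assign three cells to each part, so they are equally balanced, yet the partition with a fragmented part cuts more mesh edges. Edge cut is only an approximation of the real communication cost in a distributed system.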

Full Text