Distributed Memory Multiprocessor Systems Research Articles

We have developed a high-performance version of the Monte Carlo particle transport simulation code MC4. The original application code, developed in Visual Basic for Applications (VBA) for Microsoft Excel, was first rewritten in the C programming language to improve code portability. Several pseudo-random number generators have also been integrated and studied. The new MC4 version was then parallelized for shared- and distributed-memory multiprocessor systems using the Message Passing Interface. Two parallel pseudo-random number generator libraries (SPRNG and DCMT) have been seamlessly integrated. The performance speedup of parallel MC4 has been studied on a variety of parallel computing architectures including an Intel Xeon server with 4 dual-core processors, a Sun cluster consisting of 16 nodes of 2 dual-core AMD Opteron processors and a 200 dual-processor HP cluster. For large problem sizes, which are limited only by the physical memory of the multiprocessor server, the speedup results are almost linear on all systems. We have validated the parallel implementation against the serial VBA and C implementations using the same random number generator. Our experimental results on the transport and energy loss of electrons in a water medium show that the serial and parallel codes are equivalent in accuracy. The present improvements allow the study of higher particle energies with more accurate physical models, and improve statistics, as more particle tracks can be simulated with a short response time.
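The validation idea in the abstract above, checking that a parallel run reproduces the serial result when the same random numbers are used, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the toy energy-loss rule is not MC4's physics, the per-history seeding stands in for SPRNG or DCMT streams, and the "ranks" are emulated in a plain loop rather than run under MPI.

```python
import random

def simulate_history(history_id, base_seed=12345):
    """Toy particle history (hypothetical model, not MC4's physics).

    The particle keeps a random fraction of its energy at each step;
    the function returns the number of steps until it drops below a cutoff.
    """
    # Hypothetical per-history stream: the seed depends only on the history
    # index, so the result does not depend on which rank runs the history.
    rng = random.Random(base_seed * 1_000_003 + history_id)
    energy = 1.0
    steps = 0
    while energy > 0.01:
        energy *= rng.random()
        steps += 1
    return steps

def run_rank(rank, n_ranks, n_histories):
    """Tally for one emulated rank, using a round-robin split of histories."""
    return sum(simulate_history(h) for h in range(rank, n_histories, n_ranks))

def run_all(n_ranks, n_histories):
    """Sum the per-rank tallies; a stand-in for a reduction across ranks."""
    return sum(run_rank(r, n_ranks, n_histories) for r in range(n_ranks))

serial_total = run_all(1, 1000)    # one rank handles every history
parallel_total = run_all(8, 1000)  # eight emulated ranks share the work
print(serial_total == parallel_total)
```

The tally is an integer step count so that the comparison is exact; with floating-point tallies the order of summation can change the last digits, and a tolerance would be needed instead.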

Many problems today need the computing power that is only available by using large-scale parallel processing. For a significant number of these problems, the density of the global communications between the individual processors dominates the performance of the whole parallel implementation on a distributed memory multiprocessor system. In these cases, the design of the interconnection network for the processors is known to play a significant part in the efficient implementation of real problems. Important criteria to optimise the efficiency of a configuration are the maximum and average distance a message has to travel between processors. Minimum path systems are irregular multiprocessor computer architectures which optimise these criteria. These architectures provide an efficient alternative to the more common regular topologies for solving real applications in parallel. This paper presents new results for two combinatorial problems that occur during the generation of these optimal irregular configurations. These are: (1) The design of the optimum interconnection network between the processors for configurations containing up to 128 processors. (2) The design of the routing tables to provide the optimal routing of messages within these irregular networks. The paper shows how these combinatorial problems have been solved, using genetic algorithms for the first problem and a random local search procedure for the second. It also includes a comparison with the results obtained for regular topologies, for example: hypercubes, tori, and rings.
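The two criteria named in the abstract above, the maximum and the average distance a message travels between processors, can be computed for any connected topology with a breadth-first search from every node. The sketch below does this for two of the regular topologies mentioned, a ring and a hypercube. The function names are illustrative assumptions, and this is not the paper's genetic algorithm, only the kind of metric such a search might evaluate.

```python
from collections import deque

def ring(n):
    """Adjacency lists for a ring of n processors."""
    return {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}

def hypercube(dim):
    """Adjacency lists for a hypercube with 2**dim processors."""
    return {i: [i ^ (1 << b) for b in range(dim)] for i in range(1 << dim)}

def bfs_distances(adj, source):
    """Hop counts from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def distance_metrics(adj):
    """Return (maximum distance, average distance) over all ordered pairs
    of distinct nodes. Assumes the network is connected."""
    n = len(adj)
    longest = 0
    total = 0
    for s in adj:
        dist = bfs_distances(adj, s)
        longest = max(longest, max(dist.values()))
        total += sum(dist.values())
    return longest, total / (n * (n - 1))

ring_diameter, ring_average = distance_metrics(ring(8))
cube_diameter, cube_average = distance_metrics(hypercube(3))
print(ring_diameter, round(ring_average, 3))  # 4 2.286
print(cube_diameter, round(cube_average, 3))  # 3 1.714
```

With 8 processors each, the hypercube has both a smaller maximum and a smaller average distance than the ring, at the cost of three links per node instead of two.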

Articles published on Distributed Memory Multiprocessor Systems

Software implementation of the conjugate gradient method for shared and distributed memory multiprocessor systems

Analytical Estimation of the Scalability of Iterative Numerical Algorithms on Distributed Memory Multiprocessors

Range query processing on single and multi GPU environments

Adaptive thermo-fluid moving boundary computations for interfacial dynamics

Parallel WaveCluster: A linear scaling parallel clustering algorithm implementation with application to very large datasets

Reconfigurable mesh-based inter-chip optical interconnection network for distributed-memory multiprocessor system

Parallelization of a Monte Carlo particle transport simulation code

Applying Data Mapping Techniques to Vector DSPs

Numerical simulation of an electromagnetic field by parallelized 3D AIBO-FDTD

The performance of parallel iterative solvers

Investigation on 3-D implicit FDTD method for parallel processing

Speedup in solving differential equations on clusters of workstations

Performance evaluation of a list scheduling algorithm in distributed memory multiprocessor systems

A class of parallel multiple‐front algorithms on subdomains

Task scheduling using a block dependency DAG for block-oriented sparse Cholesky factorization

Indefinitely preconditioned conjugate gradient method for large sparse equality and inequality constrained quadratic problems

A framework for integrating data alignment, distribution, and redistribution in distributed memory multiprocessors

Dynamic task scheduling using online optimization

Optimisation of irregular multiprocessor computer architectures using genetic algorithms

A concurrent network architecture for cost-efficient parallel computing using workstation clusters
