Abstract

A recent trend in high performance computing (HPC) is to use networks of workstations (NOWs) as a cheaper alternative to massively parallel multiprocessors or supercomputers. In such parallel systems, individual workstations are connected through widely used standard communication networks and co-operate to solve one large problem. Each workstation is treated like a processing element in a conventional multiprocessor system. To make the whole system appear to applications as a single parallel computing engine (a virtual parallel system), run-time environments such as PVM (Parallel Virtual Machine) and MPI (Message Passing Interface) are often used to provide an extra layer of abstraction. In this paper, we discuss a new performance evaluation method, using as an example the multidimensional DFFT (Discrete Fast Fourier Transform) on a NOW built from Intel-based personal computers.

Highlights

  • With the availability of cheap personal computers, workstations and networking devices, the recent trend is to connect a number of such workstations into clusters and solve computation-intensive tasks on them in parallel

  • That means the complexity Cp is a function of the parallel algorithm's computation only. Such an assumption may hold in some centralised multiprocessor systems, but not in networks of workstations (NOWs)

  • Distributed computing was reborn as a kind of “lazy parallelism”

Summary

Effective parallel algorithms

There has been an increasing interest in the use of networks of workstations (clusters) connected by high-speed networks for solving large computation-intensive problems [1, 6, 14, 15, 19]. This trend is mainly driven by the cost effectiveness of such systems compared to large multiprocessor systems with tightly coupled processors and memories. The workstations can be connected using different network technologies, ranging from off-the-shelf devices such as Ethernet to specialised networks. Such networks and the associated software and protocols introduce latency and throughput limitations, thereby increasing the execution time of cluster-based computation. To design effective parallel algorithms for such systems, it is necessary to understand the concrete application problem, the data domain, the algorithm used and the functional flow of activities in the given application.
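
The effect of network latency and throughput on execution time can be illustrated with a small sketch. It is not the evaluation method of the paper: it assumes a simple linear cost model in which each message costs a fixed latency plus its size divided by the bandwidth, and the function and parameter names (`parallel_time`, `speedup`, `latency_s`, `bandwidth_bytes_per_s`, and so on) are hypothetical.

```python
def parallel_time(t_seq, p, n_messages, message_bytes,
                  latency_s, bandwidth_bytes_per_s):
    """Estimate run time on p workstations (illustrative model only).

    Assumption: the computation divides evenly over p workstations and
    every message costs latency + size / bandwidth, with no overlap of
    computation and communication.
    """
    t_comp = t_seq / p
    t_comm = n_messages * (latency_s + message_bytes / bandwidth_bytes_per_s)
    return t_comp + t_comm


def speedup(t_seq, p, n_messages, message_bytes,
            latency_s, bandwidth_bytes_per_s):
    """Speedup relative to the sequential run time t_seq."""
    t_par = parallel_time(t_seq, p, n_messages, message_bytes,
                          latency_s, bandwidth_bytes_per_s)
    return t_seq / t_par


if __name__ == "__main__":
    # Hypothetical numbers: 8 s sequential job, 4 workstations,
    # 10 messages of 1000 bytes, 1 ms latency, 1 MB/s bandwidth.
    print(parallel_time(8.0, 4, 10, 1000, 0.001, 1e6))
    print(speedup(8.0, 4, 10, 1000, 0.001, 1e6))
```

With no messages the model reduces to the ideal case in which run time depends on computation alone; any communication term pushes the speedup below the number of workstations, which is the reason network cost cannot be ignored in a NOW.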

Decomposition strategies
Control decomposition
The theoretical part – The discrete Fourier Transform
Perfect parallel decomposition
Domain decomposition
The results
Performance evaluation
Conclusions
