A benchmark study based on the parallel computation of the vector outer-product A = uv^T operation

Rudnei Dias Da Cunha

doi:10.1002/(sici)1096-9128(199708)9:8<803::aid-cpe266>3.0.co;2-q

Abstract

In this paper we benchmark the performance of the Cray T3D, IBM 9076 SP/1 and Intel Paragon XP/S parallel computers, using implementations of parallel algorithms for the computation of the vector outer-product A = uv^T operation. The vector outer-product operation, although very simple in nature, requires a large number of floating-point operations, and its parallelization induces a high level of communication between the processors. It is thus well suited to measuring the relative speed of the processor, memory subsystem and network of a parallel computer. It should not be considered a ‘toy problem’, since it arises in numerical methods for the solution of systems of non-linear equations, which is still a difficult problem to solve. We present algorithms for both the explicit shared-memory and message-passing programming models, together with theoretical computation models for those algorithms. Experiments were run on those computers, using Fortran 77 implementations of the algorithms. The results show that, due to the high degree of communication between the processors, one needs a parallel computer with fast communications and carefully implemented data exchange routines. The theoretical computation model allows prediction of the speed-up to be obtained for a given problem size on a given number of processors. © 1997 John Wiley & Sons, Ltd.
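
To make the operation concrete: for vectors u of length m and v of length n, the outer product is the m-by-n matrix with entries A[i][j] = u[i] * v[j], so it costs m*n multiplications. The sketch below is only an illustration in Python, not the paper's Fortran 77 code or its algorithms. The row-block distribution, the assumption that every worker holds all of v, and the names `outer_product`, `block_ranges` and `partitioned_outer_product` are hypothetical choices made for this example.

```python
def outer_product(u, v):
    """Serial reference: A[i][j] = u[i] * v[j]."""
    return [[ui * vj for vj in v] for ui in u]


def block_ranges(n, p):
    """Split range(n) into p contiguous blocks whose sizes differ by at most one."""
    base, extra = divmod(n, p)
    ranges = []
    start = 0
    for rank in range(p):
        size = base + (1 if rank < extra else 0)
        ranges.append((start, start + size))
        start += size
    return ranges


def partitioned_outer_product(u, v, p):
    """Simulate p workers, each computing the rows of A for its own block of u.

    Assumption (not from the paper): every worker already holds all of v.
    On a real distributed-memory machine, getting v to every worker is
    where communication (for example a broadcast) would be needed.
    """
    a = []
    for lo, hi in block_ranges(len(u), p):
        local_rows = outer_product(u[lo:hi], v)  # work done by one worker
        a.extend(local_rows)  # stands in for assembling the distributed result
    return a


u = [1.0, 2.0, 3.0]
v = [4.0, 5.0]
# The partitioned result must match the serial reference.
assert partitioned_outer_product(u, v, 2) == outer_product(u, v)
```

The workers here run one after another in a single process, so the sketch shows only how the work could be divided, not any timing or speed-up behaviour.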

Journal: Concurrency: Practice and Experience	Publication Date: Aug 1, 1997
Citations: 1
