Abstract
In this study we analyze how to select the most suitable algorithm for a given matrix-matrix multiplication operation, i.e., the one that achieves the highest throughput in the minimum time. To that end, a comparative analysis and performance evaluation of several algorithms is carried out using identical performance parameters.
Highlights
Most parallel algorithms for matrix multiplication use a matrix decomposition that is based on the number of available processors.
Matrix multiplication can be performed in O(n) time on a mesh with wraparound connections and n×n processors (Cannon, 1969); in O(log n) time on a three-dimensional mesh of trees with n³ processors (Leighton, 1992); and in O(log n) time on a hypercube or shuffle-exchange network with n³ processors (Dekel et al., 1983).
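To make Cannon's algorithm concrete, the following is a minimal sequential simulation of its block-shifting scheme on a p×p grid of "processors". The function name `cannon_matmul` and the assumption that n is divisible by p are choices made for this sketch, not part of the original algorithm's presentation.

```python
import numpy as np

def cannon_matmul(A, B, p):
    """Sequential simulation of Cannon's algorithm on a p x p block grid.

    Assumes square n x n matrices with n divisible by p.
    """
    n = A.shape[0]
    b = n // p
    # Partition A, B into p x p grids of b x b blocks.
    Ab = [[A[i*b:(i+1)*b, j*b:(j+1)*b].copy() for j in range(p)] for i in range(p)]
    Bb = [[B[i*b:(i+1)*b, j*b:(j+1)*b].copy() for j in range(p)] for i in range(p)]
    Cb = [[np.zeros((b, b)) for _ in range(p)] for _ in range(p)]
    # Initial alignment: shift row i of A left by i, column j of B up by j.
    Ab = [[Ab[i][(j + i) % p] for j in range(p)] for i in range(p)]
    Bb = [[Bb[(i + j) % p][j] for j in range(p)] for i in range(p)]
    for _ in range(p):
        # Each "processor" (i, j) multiplies its current blocks.
        for i in range(p):
            for j in range(p):
                Cb[i][j] += Ab[i][j] @ Bb[i][j]
        # Shift A blocks one step left and B blocks one step up (wraparound).
        Ab = [[Ab[i][(j + 1) % p] for j in range(p)] for i in range(p)]
        Bb = [[Bb[(i + 1) % p][j] for j in range(p)] for i in range(p)]
    return np.block(Cb)
```

On a real mesh each shift is a nearest-neighbor communication, so with n×n processors (p = n, b = 1) the p compute/shift rounds give the O(n) running time cited above.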
The advantage of using a linear transformation in designing the systolic array for matrix multiplication is that, by applying Theorem 2, one can determine the number of PEs in the corresponding systolic array.
Summary
Most parallel algorithms for matrix multiplication use a matrix decomposition that is based on the number of available processors. This includes the systolic algorithm (Choi et al., 1992), Cannon's algorithm (Alpatov et al., 1997), Fox and Otto's algorithm (Agarwal et al., 1995), PUMMA (Parallel Universal Matrix Multiplication) (Choi et al., 1994), SUMMA (Scalable Universal Matrix Multiplication) (Cannon, 1969) and DIMMA (Distribution Independent Matrix Multiplication) (Chtchelkanova et al., 1995). The standard method for n×n matrix multiplication uses O(n³) operations (multiplications). The aim is to develop highly parallel algorithms that have a cost lower than O(n³).
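The standard O(n³) method referred to above is the familiar triple loop, which performs n³ scalar multiplications; a minimal sketch (the helper name `matmul_naive` is an assumption of this example):

```python
def matmul_naive(A, B):
    """Standard O(n^3) matrix multiplication: n^3 scalar multiplications."""
    n = len(A)
    C = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for k in range(n):
                C[i][j] += A[i][k] * B[k][j]
    return C
```

Parallel algorithms such as those listed above reduce the running time below O(n³) by distributing these n³ multiplications across processors, not by reducing their total count.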