Abstract
Considerable attention has been paid to parallelizing algorithms for computationally intensive applications. In this paper, we propose a new parallel matrix multiplication algorithm on the Hex-Cell interconnection network. The proposed algorithm was evaluated and compared with the sequential algorithm in terms of speedup and efficiency on IMAN1, where a set of simulation runs was carried out on different input data distributions of different sizes. The simulation results support the theoretical analysis and meet expectations, showing good performance in terms of speedup and efficiency.
Highlights
Matrix multiplication is commonly used in many areas, such as graph theory, residue-level protein folding [4], numerical algorithms, digital image processing, and others.
The IMAN1 Zaina cluster is used to conduct our experiments, and the Open MPI library is used in our implementation of the parallel matrix multiplication algorithms. The experiments run on a dual quad-core Intel Xeon CPU with SMP and 16 GB RAM; the software environment is Scientific Linux 6.4 with Open MPI 1.5.4 and C and C++ compilers.
It shows information about the expected size of the input data that can be assigned for each group in a lucky-case partitioning, when applying the parallel matrix multiplication on the Hex-Cell interconnection network
Summary
Matrix multiplication is commonly used in many areas, such as graph theory, residue-level protein folding [4], numerical algorithms, digital image processing, and others. Multiplying huge matrices requires a lot of computation time, since the time complexity of the sequential matrix multiplication algorithm is O(n^3), where n is the dimension of the matrix. Because applications demand higher computational throughput, many parallel algorithms based on sequential algorithms have been developed to improve the performance of matrix multiplication. Many improvements [7, 8] have been made to sequential algorithms to meet these demands, but they still show limited performance. In common parallel matrix multiplication algorithms, the decomposition of the matrices depends on the number of processors available in the interconnection network [10, 9]. During execution, each processor calculates a partial multiplication result using the submatrices it currently accesses. When the multiplication is completed, the coordinator processor assembles the partial results and generates the complete matrix multiplication result.
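The general scheme described above (a sequential O(n^3) baseline, a decomposition that depends on the number of processors, and partial results assembled into one output) can be illustrated with a minimal sketch in C. This is not the Hex-Cell algorithm from the paper: it assumes a simple contiguous row-block partitioning, simulates the processors with a loop instead of Open MPI calls, and all function names (`matmul_rows`, `matmul_seq`, `row_block`, `matmul_partitioned`) are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Computes rows [lo, hi) of C = A * B for n x n row-major matrices.
 * This is the partial result one (hypothetical) processor would produce. */
void matmul_rows(size_t n, size_t lo, size_t hi,
                 const double *A, const double *B, double *C)
{
    for (size_t i = lo; i < hi; i++) {
        for (size_t j = 0; j < n; j++) {
            double sum = 0.0;
            for (size_t k = 0; k < n; k++)
                sum += A[i * n + k] * B[k * n + j];
            C[i * n + j] = sum;
        }
    }
}

/* Sequential baseline: one worker computes all n rows, O(n^3) operations. */
void matmul_seq(size_t n, const double *A, const double *B, double *C)
{
    matmul_rows(n, 0, n, A, B, C);
}

/* Assumed partitioning (not taken from the paper): contiguous row blocks
 * whose sizes differ by at most one row. Writes the half-open row range
 * [*lo, *hi) owned by processor `rank` out of `p`. */
void row_block(size_t n, size_t p, size_t rank, size_t *lo, size_t *hi)
{
    assert(p > 0 && rank < p);
    size_t base = n / p;   /* rows every processor gets */
    size_t extra = n % p;  /* the first `extra` processors get one more row */
    *lo = rank * base + (rank < extra ? rank : extra);
    *hi = *lo + base + (rank < extra ? 1 : 0);
}

/* Simulates p processors one after another. Each writes its partial result
 * directly into its rows of C, which stands in for the coordinator
 * assembling the complete result. */
void matmul_partitioned(size_t n, size_t p,
                        const double *A, const double *B, double *C)
{
    for (size_t rank = 0; rank < p; rank++) {
        size_t lo, hi;
        row_block(n, p, rank, &lo, &hi);
        matmul_rows(n, lo, hi, A, B, C);
    }
}
```

Calling `matmul_partitioned(n, p, A, B, C)` should produce the same matrix as `matmul_seq(n, A, B, C)` for any `p >= 1`. In a real message-passing implementation, each row range would be computed by a separate process and the coordinator would gather the blocks, but that communication step is deliberately left out of this sketch.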
More From: International Journal of Computer Science and Information Technology