Impact of interconnection networks in a massively parallel FPGA architecture on a parallel reduction algorithm

Mouna Baklouti, Philippe Marquet, Jean Luc Dekeyser, Mohamed Abid

doi:10.1109/idt.2008.4802499

Abstract

As the size, hardware complexity, and programming diversity of parallel systems continue to grow, so does the range of alternatives for implementing a task on these systems. Choosing a parallel algorithm and its implementation becomes an important decision with a significant impact on the execution time of the application. This paper focuses on the implementation of a SIMD parallel reduction algorithm on a massively parallel FPGA architecture. Parallel reduction is a common and important data-parallel primitive. We study the impact of the interconnection network topology on the number of data transfers required to perform the computation. The paper also introduces two flexible, parametric communication networks, integrated in a SIMD SoC architecture, that manage both regular and irregular communications. When configuring the architecture, the programmer can include one or both networks and so select the most appropriate one for a given application. Finally, the performance of the reduction algorithm on the proposed architecture is evaluated. The goal of this work is to highlight implementation decisions that influence the overall performance of a parallel algorithm. We conclude that the interconnection network of a massively parallel architecture has a great impact on the performance of a data-parallel algorithm.
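To make the link between topology and transfer count concrete, here is a minimal Python sketch of a tree-style SIMD reduction. It is a toy model and not the architecture or the networks described in the paper: the function name `simd_tree_reduce`, the power-of-two restriction, and the two cost models (a "direct" network where any PE reaches any other in one transfer, and a linear array where data moves one neighbour per transfer) are all assumptions made for illustration.

```python
from operator import add


def simd_tree_reduce(values, op=add):
    """Toy model of a tree reduction across SIMD processing elements (PEs).

    Returns (result, direct_steps, neighbour_steps), where
    - direct_steps counts communication steps if any PE can reach any
      other PE in one transfer (hypothetical direct network), and
    - neighbour_steps counts them if data can only move one PE per
      transfer (hypothetical linear array with nearest-neighbour links).
    """
    n = len(values)
    if n == 0 or n & (n - 1):
        raise ValueError("number of PEs must be a power of two")

    pe = list(values)      # pe[i] is the local value held by PE i
    direct_steps = 0
    neighbour_steps = 0
    stride = 1
    while stride < n:
        # One SIMD step: every active PE i combines its value with the
        # value sent by PE i + stride. All transfers happen in lockstep.
        for i in range(0, n, 2 * stride):
            pe[i] = op(pe[i], pe[i + stride])
        direct_steps += 1          # one transfer, whatever the distance
        neighbour_steps += stride  # 'stride' neighbour-to-neighbour shifts
        stride *= 2
    return pe[0], direct_steps, neighbour_steps


result, direct, neighbour = simd_tree_reduce(range(16))
print(result, direct, neighbour)  # prints: 120 4 15
```

Under these assumptions, reducing N values takes log2(N) communication steps on the direct network but N - 1 on the linear array, which is the kind of topology-dependent gap in data transfers that the abstract refers to.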

