Exploring the Scalability of OpenCL Coarse Grained Parallelism on Cloud FPGAs

Jhanani Thiagarajan, Arnab A Purkayastha, Atul Patil, Hamed Tabkhi

doi:10.1109/socc49529.2020.9524765

Abstract

OpenCL programmability combined with the pipelined parallelism of FPGAs has enabled high-performance, power-efficient execution of massively parallel applications. This paper provides an exhaustive study of the scalability of OpenCL coarse-grained parallelism, namely Compute Unit (CU) replication, on cloud FPGAs. It demonstrates that for many applications there is an optimum number of CUs that maximizes performance, determined by memory bandwidth, the memory conflicts introduced by CU replication, and the available FPGA resources. The paper also provides a source-code template and an optimized front-end design tool to explore and identify the optimum CU number for a given application, while hiding the programming and exploration difficulties from programmers. Experimental results on 15 applications taken from the Xilinx SDAccel v2017.4 suite and the Rodinia Benchmark Suite v3.1 show a speedup of 6.2X and a bandwidth improvement of 3.5X, at a mere 1.04X power and less than 10% resource utilization on average. In addition, the tool yields a 31% improvement in total design synthesis time for an illustrative Histogram application.
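The claim that CU replication has an optimum, rather than scaling indefinitely, can be illustrated with a toy model. The sketch below is not the paper's tool or its cost model. It assumes throughput is the smaller of a compute bound that grows linearly with the CU count and a memory bound that shrinks as replication adds conflicts. The function names, the linear conflict penalty, and every parameter value are hypothetical and chosen only for illustration.

```python
def estimated_throughput(num_cus, per_cu_rate, mem_bandwidth,
                         bytes_per_item, conflict_penalty):
    """Toy estimate of items/s for `num_cus` replicated compute units.

    Assumption (not from the paper): each extra CU reduces the usable
    memory bandwidth by a fixed fractional penalty due to conflicts.
    """
    # Compute bound: every CU processes items independently.
    compute_bound = num_cus * per_cu_rate
    # Memory bound: shared bandwidth, degraded by inter-CU conflicts.
    usable_bw = mem_bandwidth / (1.0 + conflict_penalty * (num_cus - 1))
    memory_bound = usable_bw / bytes_per_item
    return min(compute_bound, memory_bound)


def best_cu_count(max_cus_by_resources, **model):
    """Return the CU count with the highest estimated throughput.

    `max_cus_by_resources` stands in for the FPGA resource limit
    (how many CU copies fit on the device).
    """
    candidates = range(1, max_cus_by_resources + 1)
    return max(candidates, key=lambda n: estimated_throughput(n, **model))


# Hypothetical parameters, chosen only to show the shape of the curve.
model = dict(per_cu_rate=1e8,        # items/s per CU
             mem_bandwidth=1.6e10,   # bytes/s
             bytes_per_item=16,
             conflict_penalty=0.05)

print(best_cu_count(16, **model))    # prints 8
print(best_cu_count(4, **model))     # prints 4 (resource-limited)
```

With these made-up numbers, throughput rises linearly up to 7 CUs, peaks at 8 once the memory bound takes over, and then falls as conflicts grow. Capping the device at 4 CUs makes resources the limiting factor instead.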

Similar Papers

On the Evaluation of Coarse Grained Parallelism in AV1 Video Coding
Panos K. Papadopoulos ... Petr Saloun
01 Sep 2018

Coarse grained parallel quantum genetic algorithm for reconfiguration and service restoration of electric power networks
Ahmed Adel Hieba ... Nabil H Abbasy
International Journal of Hybrid Intelligent Systems | VOL. 15
22 Aug 2019

Coarse Grained Parallelism Optimization for Multicore Architectures: The ALMA Project Approach
George Goulas ... Christos Gogos
01 Jan 2013

Finding Coarse Grained Parallelism in Computational Geometry Algorithms
Volodymyr Beletskyy
01 Jan 2003
