Efficient FPGA Implementation of OpenCL High-Performance Computing Applications via High-Level Synthesis

Fahad Bin Muslim,Luciano Lavagno,Liang Ma,Mehdi Roozmeh

doi:10.1109/access.2017.2671881


Open Access

https://doi.org/10.1109/access.2017.2671881

Journal: IEEE Access
Publication Date: Jan 1, 2017
Citations: 80
License type: CC BY 3.0

Affiliation: Polytechnic University of Turin

Abstract

FPGA-based accelerators have recently evolved as strong competitors to the traditional GPU-based accelerators in modern high-performance computing systems. They offer both high computational capabilities and considerably lower energy consumption. High-level synthesis (HLS) can be used to overcome the main hurdle in the mainstream usage of the FPGA-based accelerators, i.e., the complexity of their design flow. HLS enables the designers to program an FPGA directly by using high-level languages, e.g., C, C++, SystemC, and OpenCL. This paper presents an HLS-based FPGA implementation of several algorithms from a variety of application domains. A performance comparison in terms of execution time, energy, and power consumption with some high-end GPUs is performed as well. The algorithms have been modeled in OpenCL for both GPU and FPGA implementation. We conclude that FPGAs are much more energy-efficient than GPUs in all the test cases that we considered. Moreover, FPGAs can sometimes be faster than GPUs by using an FPGA-specific OpenCL programming style and utilizing a variety of appropriate HLS directives.
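
The abstract does not show what an FPGA-specific programming style or an HLS directive looks like in practice. As a minimal sketch, and not the authors' code, the plain C function below processes every element in one loop instead of relying on one work-item per element, and marks that loop with a pipelining directive. The function name `vadd`, the single-loop formulation, and the Vivado-HLS-style `PIPELINE` pragma are illustrative assumptions; the abstract does not name the directives or tool flow used.

```c
#include <stddef.h>

/* Hypothetical example: element-wise vector addition written as one loop.
 * An HLS tool can turn such a loop into a hardware pipeline, whereas a GPU
 * version would typically assign each element to its own work-item. */
void vadd(const float *a, const float *b, float *out, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        /* Assumed Vivado-HLS-style directive requesting one new loop
         * iteration per clock cycle. Ordinary C compilers ignore it
         * (possibly with an unknown-pragma warning). */
#pragma HLS PIPELINE II=1
        out[i] = a[i] + b[i];
    }
}
```

Compiled with an ordinary C compiler, the function simply adds two arrays. The directive only matters when the same source is passed to an HLS tool that recognizes it.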

Highlights

Modern electronic devices like smartphones are required to perform a variety of tasks, ranging from simple text messaging to more computationally intensive multimedia operations.
FPGA implementation: this section of the paper gives a brief overview of the Open Computing Language (OpenCL) programming framework, including its platform and memory model, followed by a detailed description of the design flow, starting from the OpenCL code and ending with the final field-programmable gate array (FPGA) implementation.
This paper performs an extensive analysis of the prospect of using high-level synthesis for implementing FPGA-based accelerators in modern high-performance computing (HPC) systems.

Summary

Introduction

Modern electronic devices like smartphones are required to perform a variety of tasks, ranging from simple text messaging to more computationally intensive multimedia operations. This has resulted in the development of heterogeneous system architectures in modern system-on-chip (SoC) designs. Such systems mitigate the issues encountered by multicore scaling (using several homogeneous cores), stemming mainly from the so-called memory wall and Von Neumann bottleneck [1], [2]. Graphical processing units (GPUs) offer higher floating-point throughput, a favorable architecture for data parallelism, and higher memory bandwidth than processors. These properties make them good candidates to be used as accelerators in modern high-performance computing (HPC) systems [4]. However, HPC systems using GPU-based accelerators are inefficient in terms of power consumption [5].
