High-level Synthesis Optimization Research Articles

The channel model is by far the most computationally intensive part of link-level simulations of multiple-input multiple-output (MIMO) fifth-generation new radio (5G NR) communication systems. Simulation effort increases further with more realistic geometry-based channel models, such as the three-dimensional spatial channel model (3D-SCM). Channel emulation is used for functional and performance verification of such models in the network planning phase. These models use multiple finite impulse response (FIR) filters and have a very high degree of parallelism, which can be exploited for accelerated execution on Field Programmable Gate Array (FPGA) and Graphics Processing Unit (GPU) platforms. This paper proposes an efficient reconfigurable implementation of the 3rd Generation Partnership Project (3GPP) 3D-SCM on FPGAs using a design flow based on high-level synthesis (HLS). It studies the effect of various HLS optimization techniques on total latency and hardware resource utilization on Xilinx Alveo U280 and Intel Arria 10GX 1150 high-performance FPGAs, in both cases using the vendor's commercial HLS tools. Channel model accuracy is preserved by using double-precision floating-point arithmetic. This work analyzes in detail the effort needed to target the FPGA platforms using HLS tools, both in terms of common parallelization effort (shared by both FPGAs) and in terms of platform-specific effort, which differs between Xilinx and Intel FPGAs. Compared to the baseline general-purpose central processing unit (CPU) implementation, the achieved speedups are 65X and 95X on the Xilinx UltraScale+ and Intel Arria FPGA platforms, respectively, when using a Double Data Rate (DDR) memory interface. The FPGA-based designs also achieved ~3X better performance than an NVIDIA GeForce GTX 1070 GPU of a similar technology node, while consuming ~4X less energy. The FPGA implementation speedup improves up to 173X over the CPU baseline when using the Xilinx UltraRAM (URAM) and High-Bandwidth Memory (HBM) resources, also achieving 6X lower latency and 12X lower energy consumption than the GPU implementation.

Recent embedded applications are widely used in several industrial domains such as automotive and multimedia systems. These applications are critical and complex, requiring more computing resources and therefore increasing the power consumption of the system. Although performance remains an important design metric, power consumption has become a critical factor for many systems, particularly given the increasing complexity of recent System-on-Chip (SoC) designs. Consequently, the whole computing domain is being forced to shift its focus from high-performance computation to energy-efficient computation. In addition to the time-to-market challenge, designers need to estimate, rapidly and accurately, both the area occupation and the power consumption of complex and diverse applications. High-Level Synthesis (HLS) has emerged as an attractive solution to this challenge, allowing designers to explore a large number of design points at a high level of abstraction. In this paper, we target FPGA-based accelerators. We propose HAPE, a high-level framework based on analytic models for area and power estimation that does not require register-transfer level (RTL) implementations. This technique estimates the required FPGA resources and the power consumption at the source code level. The proposed models also enable fast design space exploration (DSE) with different trade-offs through HLS optimization pragmas, including loop unrolling, pipelining, and array partitioning. The accuracy of the proposed models is evaluated using a variety of synthetic benchmarks. Estimated power results are compared to real board measurements. The area and power estimates are within 5% of the results obtained from RTL implementations.
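As a rough illustration of the pragmas named in the abstracts above (loop unrolling, pipelining, array partitioning), here is a minimal sketch of one step of a double-precision FIR filter written for an HLS flow. It is not taken from either paper: the function name `fir_step`, the tap count `kTaps`, and the argument layout are hypothetical, and the pragma spellings follow the AMD/Xilinx Vitis HLS style. Intel's HLS compiler uses different pragmas and attributes, so this applies to the Xilinx flow only. A regular C++ compiler ignores the unknown pragmas, so the function can still be checked in software.

```cpp
#include <cstddef>

// Hypothetical tap count, chosen only for illustration.
constexpr std::size_t kTaps = 8;

// Computes one output sample of a direct-form FIR filter in double precision.
// shift_reg holds the last kTaps input samples, newest at index 0.
double fir_step(double sample, double shift_reg[kTaps], const double coeff[kTaps]) {
// Split both arrays into individual registers so all taps are readable at once.
#pragma HLS ARRAY_PARTITION variable=shift_reg complete dim=1
#pragma HLS ARRAY_PARTITION variable=coeff complete dim=1
// Ask the tool to accept a new sample every clock cycle (initiation interval 1).
#pragma HLS PIPELINE II=1

    // Advance the delay line by one position.
    for (std::size_t i = kTaps - 1; i > 0; --i) {
#pragma HLS UNROLL
        shift_reg[i] = shift_reg[i - 1];
    }
    shift_reg[0] = sample;

    // Multiply-accumulate over all taps; unrolling exposes the parallel multiplies.
    double acc = 0.0;
    for (std::size_t i = 0; i < kTaps; ++i) {
#pragma HLS UNROLL
        acc += coeff[i] * shift_reg[i];
    }
    return acc;
}
```

Each pragma trades area for latency: full unrolling and complete partitioning replicate multipliers and registers per tap, which is the kind of trade-off a design space exploration would sweep. Whether the requested initiation interval is actually met depends on the tool and target device.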

Articles published on High-level Synthesis Optimization

CollectiveHLS: A Collaborative Approach to High-Level Synthesis Design Optimization

FPGA Acceleration of 3GPP Channel Model Emulator for 5G New Radio

High Level Synthesis Optimizations of Road Lane Detection Development on Zynq-7000

High Level Design of a Flexible PCA Hardware Accelerator Using a New Block-Streaming Method

Memory-Based High-Level Synthesis Optimizations Security Exploration on the Power Side-Channel

Case study of an HEVC decoder application using high-level synthesis: intraprediction, dequantization, and inverse transform blocks

HAPE: A high-level area-power estimation framework for FPGA-based accelerators

COSMOS

High-Level Synthesis Optimization for Blocked Floating-Point Matrix Multiplication

Design of Synthesizable, Retimed Digital Filters Using FPGA Based Path Solvers with MCM Approach: Comparison and CAD Tool

Nested Loop Parallelization Using Polyhedral Optimization in High-Level Synthesis

A Novel Framework for Applying Multiobjective GA and PSO Based Approaches for Simultaneous Area, Delay, and Power Optimization in High Level Synthesis of Datapaths

Bit-Length Optimization Method for High-Level Synthesis Based on Non-linear Programming Technique

Interconnection optimization in data path allocation using minimal cost maximal flow algorithm
