Dataflow Implementation Research Articles

Specialized FPGA implementations can deliver higher performance and greater power efficiency than embedded CPU or GPU implementations for real-time image processing. Programming challenges limit their wider use, because the implementation of FPGA architectures at the register transfer level is time consuming and error prone. Existing software languages supported by high-level synthesis (HLS), although providing a productivity improvement, are too general purpose to generate efficient hardware without the use of hardware-specific code optimizations. Such optimizations leak hardware details into the abstractions that software languages are there to provide, and they require knowledge of FPGAs to generate efficient hardware, such as by using language pragmas to partition data structures across memory blocks. This article presents a thorough account of the Rathlin image processing language (RIPL), a high-level image processing domain-specific language for FPGAs. We motivate its design, based on higher-order algorithmic skeletons, with requirements from the image processing domain. RIPL’s skeletons suffice to elegantly describe image processing stencils, as well as recursive algorithms with nonlocal random access patterns. At its core, RIPL employs a dataflow intermediate representation. We give a formal account of the compilation scheme from RIPL skeletons to static and cyclostatic dataflow models to describe their data rates and static scheduling on FPGAs. RIPL compares favorably to the Vivado HLS OpenCV library and C++ compiled with Vivado HLS. RIPL achieves between 54 and 191 frames per second (FPS) at 100MHz for four synthetic benchmarks, faster than HLS OpenCV in three cases. Two real-world algorithms are implemented in RIPL: visual saliency and mean shift segmentation. For the visual saliency algorithm, RIPL achieves 71 FPS compared to optimized C++ at 28 FPS. RIPL is also concise, being 5x shorter than C++ and 111x shorter than an equivalent direct dataflow implementation. For mean shift segmentation, RIPL achieves 7 FPS compared to optimized C++ on 64 CPU cores at 1.1, and RIPL is 10x shorter than the direct dataflow FPGA implementation.

Read full abstract

Dataflow Implementation Research Articles

Related Topics

Articles published on Dataflow Implementation

STIFT: A Spatio-Temporal Integrated Folding Tree for Efficient Reductions in Flexible DNN Accelerators

Management of Climate Resilience: Exploring the Potential of Digital Twin Technology, 3D City Modelling, and Early Warning Systems.

Fast and accurate variable batch size convolution neural network training on large scale distributed systems

Aplikasi E-Transaksi dan Pelaporan Kegiatan di Galeri Investasi Politeknik Piksi Ganesha

Towards hybrid supercomputing architectures

Implementation of Data Flow, Predictive Model, and Data Visualization in Corbion Process Analysis System

Exploring mental models of the right to informational self-determination of office workers in Germany

Fast Realization of 3-D Space-Time Correlation Sea Clutter of Large-Scale Sea Scene Based on FPGA: From EM Model to Statistical Model

Classification of Farmer’s Eligibility as Recipients of Subsidized Fertilizer Assistance with C4.5 Algorithm

Megaphone

Optimizing data-flow implementations for inter-organizational processes

RIPL

StreamDrive: a Dynamic Dataflow Framework for Clustered Embedded Architectures

Algorithm for Identification of Infinite Clusters Based on Minimal Finite Automaton

A Data-Flow Soft-Core Processor for Accelerating Scientific Calculation on FPGAs

DaSH: A benchmark suite for hybrid dataflow and shared memory programming models

Embedded Multi-Core Systems Dedicated to Dynamic Dataflow Programs

Fast and Efficient Implementation of Forward Clarke Transform's Algorithm Data Flow on FPGA for Green Technology

Mapping Parameterized Cyclo-static Dataflow Graphs onto Configurable Hardware

Securing skeletal systems with limited performance penalty: The muskel experience

Lead the way for us

Editage

Paperpal

R Discovery

Mind the Graph

Dataflow Implementation Research Articles

Related Topics

Articles published on Dataflow Implementation

STIFT: A Spatio-Temporal Integrated Folding Tree for Efficient Reductions in Flexible DNN Accelerators

Management of Climate Resilience: Exploring the Potential of Digital Twin Technology, 3D City Modelling, and Early Warning Systems.

Fast and accurate variable batch size convolution neural network training on large scale distributed systems

Aplikasi E-Transaksi dan Pelaporan Kegiatan di Galeri Investasi Politeknik Piksi Ganesha

Towards hybrid supercomputing architectures

Implementation of Data Flow, Predictive Model, and Data Visualization in Corbion Process Analysis System

Exploring mental models of the right to informational self-determination of office workers in Germany

Fast Realization of 3-D Space-Time Correlation Sea Clutter of Large-Scale Sea Scene Based on FPGA: From EM Model to Statistical Model

Classification of Farmer’s Eligibility as Recipients of Subsidized Fertilizer Assistance with C4.5 Algorithm

Megaphone

Optimizing data-flow implementations for inter-organizational processes

RIPL

StreamDrive: a Dynamic Dataflow Framework for Clustered Embedded Architectures

Algorithm for Identification of Infinite Clusters Based on Minimal Finite Automaton

A Data-Flow Soft-Core Processor for Accelerating Scientific Calculation on FPGAs

DaSH: A benchmark suite for hybrid dataflow and shared memory programming models

Embedded Multi-Core Systems Dedicated to Dynamic Dataflow Programs

Fast and Efficient Implementation of Forward Clarke Transform's Algorithm Data Flow on FPGA for Green Technology

Mapping Parameterized Cyclo-static Dataflow Graphs onto Configurable Hardware

Securing skeletal systems with limited performance penalty: The muskel experience