Field Programmable Gate Array Design Research Articles

Over the past few years, many applications have exploited the advantages of deep learning, in particular convolutional neural networks (CNNs). The intrinsic flexibility of such models has led to their wide adoption in practical applications, from medical to industrial. In the industrial scenario, however, consumer Personal Computer (PC) hardware is not always suited to the potentially harsh conditions of the working environment or to the strict timing requirements typical of industrial applications. The design of custom FPGA (Field Programmable Gate Array) solutions for network inference is therefore attracting considerable attention from both researchers and companies. In this paper, we propose a family of network architectures composed of three kinds of custom layers working with integer arithmetic at customizable precision (down to just two bits). These layers are designed to be trained effectively on classical GPUs (Graphics Processing Units) and then synthesized to FPGA hardware for real-time inference. The idea is to provide a trainable quantization layer, called Requantizer, that acts both as a non-linear activation for neurons and as a value rescaler to match the desired bit precision. This way, the training is not only quantization-aware but also capable of estimating the optimal scaling coefficients to accommodate both the non-linear nature of the activations and the constraints imposed by the limited precision. In the experimental section, we test the performance of this kind of model both on classical PC hardware and in a case-study implementation of a signal peak detection device running on a real FPGA. We employ TensorFlow Lite for training and comparison, and use Xilinx FPGAs and Vivado for synthesis and implementation. The results show an accuracy of the quantized networks close to that of the floating-point version, without the need for representative calibration data required by other approaches, and performance that is better than dedicated peak detection algorithms. The FPGA implementation runs in real time at a rate of four gigapixels per second with moderate hardware resources, while achieving a sustained efficiency of 0.5 TOPS/W (tera operations per second per watt), in line with custom integrated hardware accelerators.
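As a rough illustration of the requantization idea described above (not the authors' Requantizer layer), the following sketch rescales integer accumulator values by a coefficient and clamps them to an unsigned range of the chosen bit width. The function name `requantize`, the round-half-up rule, and the use of clamping at zero as the non-linearity are assumptions made for this example; in the paper the scaling coefficients are learned during training, which is not shown here.

```python
import math

def requantize(acc, scale, bits=2):
    """Rescale accumulator values and clamp them to an unsigned `bits`-wide range.

    Assumptions (not taken from the paper): `scale` stands in for a learned
    coefficient, and clamping at zero stands in for the non-linear activation.
    """
    q_max = (1 << bits) - 1  # largest representable value, e.g. 3 for 2 bits
    out = []
    for a in acc:
        scaled = math.floor(a * scale + 0.5)     # rescale, then round half up
        out.append(min(max(scaled, 0), q_max))   # clamp to [0, q_max]
    return out

# Wide accumulator values mapped to 2-bit outputs.
print(requantize([-7, 0, 3, 9, 40], scale=0.125, bits=2))  # [0, 0, 0, 1, 3]
```

Negative inputs collapse to zero and large inputs saturate at the top of the range, so a single operation can serve as both activation and rescaler, which is the role the abstract assigns to the Requantizer.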

In this paper, we propose a new Modified Laplacian Vector Median Filter (MLVMF) for real-time denoising of complex images corrupted by "salt and pepper" impulsive noise. The method consists of two rounds of three steps each. The first round starts by identifying pixels that may be contaminated by noise using a Modified Laplacian Filter. The candidate pixels then undergo a neighborhood-based validation test. Finally, the Vector Median Filter is used to replace the noisy pixels. The MLVMF uses a 5 × 5 window to observe the intensity variations around each pixel of the image with a rotation step of π/8, whereas classic Laplacian filters often use rotation steps of π/2 or π/4. This finer rotation step improves the identification of noise-corrupted pixels. Despite this advantage, a high percentage of impulsive noise may cause two or more corrupted pixels (with the same intensity) to collide, preventing their identification. A second round is then necessary, using a second set of filters that are still based on the Laplacian operator but focus only on the collision phenomenon. To validate our method, MLVMF is first tested on standard images, with a noise percentage varying from 3% to 30%. The processing time and the image restoration quality, measured by the PSNR (Peak Signal to Noise Ratio) and NCD (Normalized Color Difference) metrics, are compared with those of VMF (Vector Median Filter), VMRHF (Vector Median-Rational Hybrid Filter), and MSMF (Modified Switching Median Filter). A second test is performed on several noisy chest X-ray images used in the diagnosis of cardiovascular disease and COVID-19. The proposed method shows very good restoration quality on this type of image, particularly when the percentage of noise is high. The MLVMF provides a high PSNR value of 5.5% and a low NCD value of 18.2%. Finally, an optimized Field-Programmable Gate Array (FPGA) design is presented to implement the method for real-time processing. The hardware implementation achieves an execution time of 9 ms per 256 × 256 color image.
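The replacement step of the method relies on the Vector Median Filter. A minimal sketch of the standard vector median, assuming Euclidean distance between color vectors, is given below: the output is the pixel in the window whose summed distance to all other pixels in the window is smallest. The function name `vector_median` and the sample window are hypothetical, and the Modified Laplacian detection and validation steps of MLVMF are not reproduced.

```python
import math

def vector_median(pixels):
    """Return the pixel whose summed Euclidean distance to all others is smallest."""
    def total_distance(p):
        return sum(math.dist(p, q) for q in pixels)
    return min(pixels, key=total_distance)

# Hypothetical window of RGB pixels containing one "salt" outlier.
window = [(10, 12, 11), (11, 12, 10), (255, 255, 255), (9, 13, 12), (10, 11, 11)]
print(vector_median(window))  # (10, 12, 11)
```

Because the result is always one of the input vectors, the filter never introduces colors that were absent from the window, which is a common reason for preferring it over per-channel medians on color images.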

Articles published on Field Programmable Gate Array Design

Quantization-Aware NN Layers with High-throughput FPGA Implementation for Edge AI

A New Recursive Trigonometric Technique for FPGA-Design Implementation

A Domain-Specific Accelerator for Ultralow Latency Market Data Distribution System

FCRoute: A Fast FPGA Connection Router Using Soft Routing-Space Pruning Algorithm

Hardware Design and Implementation of FPGA Controlled Seven-Level Reduced Switch MLI

Design of novel fractional order FPGA based reactor protection and safety controllers for ACP1000 nuclear power plant in LabVIEW

An integrated and scalable experimental system for nitrogen-vacancy ensemble magnetometry.

A Sparse CNN Accelerator for Eliminating Redundant Computations in Intra- and Inter-Convolutional/Pooling Layers

Importance of Edge Computing in Critical Manufacturing Systems: FPGA Implementation

New Real-Time Impulse Noise Removal Method Applied to Chest X-ray Images.

Constrained Optimization of FPGA Design for Spaceborne InSAR Processing

Development of a Signal Processing Software for Scintillation Detectors and Implementation on an FPGA for Fast Sensing

A Comprehensive Methodology to Optimize FPGA Designs via the Roofline Model

HXDP

Field-programmable gate array design of image encryption and decryption using Chua’s chaotic masking

STT-MRAM-Based Multicontext FPGA for Multithreading Computing Environment

FPGA Implementation for the Sigmoid with Piecewise Linear Fitting Method Based on Curvature Analysis

Resources and Power Efficient FPGA Accelerators for Real-Time Image Classification.

FPGA Accelerator for Real-Time Non-Line-of-Sight Imaging

XDNN: Inference for Deep Convolutional Neural Networks
