Efficient Hardware Architecture Research Articles

Abstract: The field of computer vision is characterized by computationally intensive algorithms and techniques with strict real-time requirements. Field programmable gate arrays (FPGAs) are based on a concurrent paradigm that allows the design of efficient hardware architectures and has positioned FPGAs as an ideal device for implementing compute-intensive applications. For this reason, FPGA technology has had a great impact in areas such as computer vision, where one of the main objectives for researchers is to create efficient automatic object recognition systems. The need to provide undergraduates with the skills required to design FPGA-based object recognition systems is therefore evident. With this aim in mind, specialization courses on the design of these systems must include the resources students need to apply theoretical knowledge to practical problems. In this article, we present a development tool designed to help students, teachers, and researchers during the design-modeling-implementation process of object recognition systems based on FPGAs. The proposed tool follows a modular approach, which makes it possible to work on any single phase of a recognition system, and it is considered hybrid because the remaining phases can be developed in a software language. An empirical evaluation involving undergraduates enrolled in a Computer Engineering program was conducted, in which they created a hardware architecture for the DAISY descriptor, which uses the homogeneous features of objects immersed in images to produce an efficient representation. Compared with similar descriptors such as Scale-Invariant Feature Transform (SIFT) and Histogram of Oriented Gradients (HOG), DAISY is computed by convolving orientation maps instead of using weighted sums of gradient norms, which offers the same kind of invariance at a lower computational cost for the dense case. The results of the evaluation indicated that students consider this FPGA-based tool to be an alternative means of receiving practical training on designing systems that solve object recognition problems.
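The remark that DAISY convolves orientation maps can be illustrated with a small software sketch. The code below is not the hardware architecture from the article and is not a full DAISY descriptor; it only shows the first stage under stated assumptions: central-difference gradients, clamped borders, eight orientations, and a single Gaussian scale. All function names (`gradients`, `orientation_maps`, `smoothed_orientation_maps`) are hypothetical and chosen for illustration.

```python
import math

def gradients(img):
    """Central-difference gradients of a 2D list image, borders clamped."""
    h, w = len(img), len(img[0])
    gx = [[(row[min(x + 1, w - 1)] - row[max(x - 1, 0)]) / 2.0
           for x in range(w)] for row in img]
    gy = [[(img[min(y + 1, h - 1)][x] - img[max(y - 1, 0)][x]) / 2.0
           for x in range(w)] for y in range(h)]
    return gx, gy

def orientation_maps(img, n_orient=8):
    """One map per direction: the positive part of the directional derivative."""
    gx, gy = gradients(img)
    maps = []
    for k in range(n_orient):
        theta = 2.0 * math.pi * k / n_orient
        c, s = math.cos(theta), math.sin(theta)
        maps.append([[max(c * a + s * b, 0.0) for a, b in zip(rx, ry)]
                     for rx, ry in zip(gx, gy)])
    return maps

def gaussian_kernel(sigma):
    """Normalized 1D Gaussian kernel truncated at about three sigma."""
    r = max(1, int(math.ceil(3.0 * sigma)))
    k = [math.exp(-(i * i) / (2.0 * sigma * sigma)) for i in range(-r, r + 1)]
    total = sum(k)
    return [v / total for v in k]

def blur(m, sigma):
    """Separable Gaussian convolution (rows, then columns), borders clamped."""
    k = gaussian_kernel(sigma)
    r = len(k) // 2
    h, w = len(m), len(m[0])
    tmp = [[sum(k[i + r] * row[min(max(x + i, 0), w - 1)]
                for i in range(-r, r + 1)) for x in range(w)] for row in m]
    return [[sum(k[i + r] * tmp[min(max(y + i, 0), h - 1)][x]
                 for i in range(-r, r + 1)) for x in range(w)] for y in range(h)]

def smoothed_orientation_maps(img, n_orient=8, sigma=1.0):
    """Convolve every orientation map with the same Gaussian kernel."""
    return [blur(m, sigma) for m in orientation_maps(img, n_orient)]

if __name__ == "__main__":
    # Toy image with a vertical step edge between columns 3 and 4.
    image = [[1.0 if x >= 4 else 0.0 for x in range(8)] for _ in range(8)]
    maps = smoothed_orientation_maps(image)
    print(len(maps), "maps of size", len(maps[0]), "x", len(maps[0][0]))
```

Because each smoothed map is computed once for the whole image, a dense descriptor can read its histogram values from these maps at every pixel, which is the source of the lower cost in the dense case mentioned above.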

Spiking neural networks (SNNs) have emerged as a hardware-efficient architecture for classification tasks. The challenge of spike-based encoding has been the lack of a universal training mechanism performed entirely using spikes. There have been several attempts to adopt the powerful backpropagation (BP) technique used in non-spiking artificial neural networks (ANNs): (1) SNNs can be trained by externally computed numerical gradients. (2) A major advancement towards native spike-based learning has been the use of approximate BP using spike-timing-dependent plasticity with phased forward/backward passes. However, the transfer of information between such phases for gradient and weight update calculation necessitates external memory and computational access. This is a challenge for standard neuromorphic hardware implementations. In this paper, we propose a stochastic SNN-based back-prop (SSNN-BP) algorithm that utilizes a composite neuron to simultaneously compute the forward pass activations and backward pass gradients explicitly with spikes. Although signed gradient values are a challenge for spike-based representation, we tackle this by splitting the gradient signal into positive and negative streams. The composite neuron encodes information in the form of stochastic spike trains and converts BP weight updates into temporally and spatially local spike coincidence updates compatible with hardware-friendly resistive processing units. Furthermore, we characterize the quantization effect of discrete spike-based weight updates to show that our method approaches the BP ANN baseline with sufficiently long spike trains. Finally, we show that the well-performing softmax cross-entropy loss function can be implemented through inhibitory lateral connections enforcing a winner-take-all rule. Our two-layer SNN shows excellent generalization, with performance comparable to ANNs with equivalent architecture and regularization parameters on static image datasets like MNIST, Fashion-MNIST, and Extended MNIST, and on temporally encoded image datasets like Neuromorphic MNIST. Thus, SSNN-BP enables BP compatible with purely spike-based neuromorphic hardware.
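The idea of splitting a signed gradient into positive and negative spike streams and updating a weight from spike coincidences can be sketched in a few lines. This is a simplified illustration, not the composite neuron or the SSNN-BP algorithm from the paper. It assumes that activations and gradient magnitudes are already scaled to [0, 1], that spikes are independent Bernoulli draws, and that the update follows plain gradient descent. The function names are hypothetical.

```python
import random

def spike_train(p, length, rng):
    """Bernoulli spike train: each time step fires with probability p."""
    return [1 if rng.random() < p else 0 for _ in range(length)]

def split_signed(g):
    """Represent a signed value as (positive rate, negative rate)."""
    return max(g, 0.0), max(-g, 0.0)

def coincidence_update(x, delta, length, lr, rng):
    """Estimate the gradient-descent step -lr * x * delta from spike coincidences.

    x      : presynaptic activation in [0, 1] (assumed scaling)
    delta  : backpropagated gradient in [-1, 1] (assumed scaling)
    length : number of time steps in each spike train
    """
    pos, neg = split_signed(delta)
    pre = spike_train(x, length, rng)
    grad_pos = spike_train(pos, length, rng)
    grad_neg = spike_train(neg, length, rng)
    # A presynaptic spike coinciding with a negative-gradient spike raises the
    # weight; a coincidence with a positive-gradient spike lowers it.
    up = sum(a & b for a, b in zip(pre, grad_neg))
    down = sum(a & b for a, b in zip(pre, grad_pos))
    return lr * (up - down) / length

if __name__ == "__main__":
    rng = random.Random(0)
    exact = -1.0 * 0.8 * (-0.5)  # the update the spike counts should approximate
    for T in (10, 100, 10000):
        print(T, coincidence_update(0.8, -0.5, T, 1.0, rng), "exact:", exact)
```

Each coincidence changes the estimate by a fixed step of `lr / length`, so short trains give a coarse, noisy update and longer trains move it closer to the exact product. This is one way to picture the quantization effect the abstract describes.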

Articles published on Efficient Hardware Architecture

A survey of graph convolutional networks (GCNs) in FPGA-based accelerators

Fault-Tolerant Operation of Bosonic Qubits with Discrete-Variable Ancillae

Quantized CNN-based efficient hardware architecture for real-time hand gesture recognition

A Low Complexity Cooperative Spectrum Sensor for Cognitive-Radio Network Based on Approximate Computing

9.1 µW keyword spotting processor based on optimized MFCC and small‐footprint TENet in 28‐nm CMOS

CNC: A lightweight architecture for Binary Ring-LWE based PQC

An FPGA‐based tool for supporting the design, modeling, and evaluation of hybrid object recognition systems on computer engineering courses

Enhancing Real-time Simultaneous Localization and Mapping with FPGA-based EKF-SLAM's Hardware Architecture

An efficient hardware architecture of integer motion estimation based on early termination and data reuse for Versatile video coding

Radix-4 CORDIC algorithm based low-latency and hardware efficient VLSI architecture for Nth root and Nth power computations

CRAFT: Criticality-Aware Fault-Tolerance Enhancement Techniques for Emerging Memories-Based Deep Neural Networks

A New ACD-OMP Accelerator With Clustered Computing Look-Ahead

Design and implementation of hardware-efficient architecture for saturation-based image dehazing algorithm

A temporally and spatially local spike-based backpropagation algorithm to enable training in hardware

Configurable Encryption and Decryption Architectures for CKKS-Based Homomorphic Encryption

Accelerated FPGA-Based Vector Directional Filter for Real-Time Color Image Denoising with Enhanced Performance

Soft Decision Decoding with Cyclic Information Set and the Decoder Architecture for Cyclic Codes

Silicon Photonic Phase-Diverse Receiver Enabling Transmission of >Net 250 Gbps/λ Over 40 km for High-Speed and Low-Cost Short-Reach Optical Communications

An Area-Efficient Accelerator for Non-Maximum Suppression

Hardware Architecture of a QAM Receiver for Short-Range Optical Communications
