Large-scale discrete convolution, well known to be computationally intensive, is a fundamental algorithmic building block in many computer vision and artificial intelligence applications. This work presents a novel stochastic-based hardware architecture and design that computes discrete convolution based on the widely used convolution theorem. Our approach has three advantages. First, it achieves approximately <inline-formula><tex-math notation="LaTeX">${\mathrm O} (1)$</tex-math></inline-formula> algorithmic complexity for any given absolute error bound <inline-formula><tex-math notation="LaTeX">$d$</tex-math></inline-formula> and any given input vector size <inline-formula><tex-math notation="LaTeX">$N$</tex-math></inline-formula>. This complexity, compared with the <inline-formula><tex-math notation="LaTeX">${\mathrm O} (N^2)$</tex-math></inline-formula> and <inline-formula><tex-math notation="LaTeX">${\mathrm O} (N \log N)$</tex-math></inline-formula> operations required by conventional multiplier-based and FFT-based architectures, respectively, represents a significant improvement, albeit at the cost of degraded computing accuracy. Second, we prove analytically that our stochastic-based convolution requires only a moderate number of random samples to reach a given computing accuracy; for example, 788 random samples achieve 95 percent accuracy at a 99 percent confidence level for a convolution with <inline-formula><tex-math notation="LaTeX">$N=128$</tex-math></inline-formula>. Third, the proposed stochastic-based architecture is highly fault-tolerant because the information being processed is encoded across a large ensemble of random samples; local perturbations in computing accuracy are therefore dissipated globally and become inconsequential to the final overall results.
We believe that, being highly scalable and energy efficient, our stochastic-based convolution architecture is well suited for many real-time embedded applications, especially perception-based computing tasks that are inherently fault-tolerant. In short, this work provides an elegant way to trade off computing accuracy against computing performance and hardware efficiency for many real-world convolution-based applications.
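The accuracy-versus-samples tradeoff described above can be illustrated with a minimal Monte Carlo sketch. This is a plain software illustration, not the authors' hardware architecture or their convolution-theorem-based estimator: it estimates a single circular-convolution output from a uniformly sampled subset of the summation terms, so the estimate tightens as the number of random samples grows. The function name `conv_output_mc` and the uniform index sampling are assumptions made for this sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

def conv_output_mc(x, y, n, num_samples):
    """Monte Carlo estimate of one circular-convolution output
    z[n] = sum_k x[k] * y[(n - k) mod N], using a random subset of terms."""
    N = len(x)
    ks = rng.integers(0, N, size=num_samples)   # uniform random summation indices
    terms = x[ks] * y[(n - ks) % N]             # sampled terms of the convolution sum
    return N * terms.mean()                     # unbiased estimator: scale sample mean by N

N = 128
x = rng.standard_normal(N)
y = rng.standard_normal(N)

# Exact circular convolution via the convolution theorem (FFT), for comparison.
exact = np.real(np.fft.ifft(np.fft.fft(x) * np.fft.fft(y)))

# Estimate z[5] from 788 random samples (the sample count quoted in the abstract).
approx = conv_output_mc(x, y, n=5, num_samples=788)
print(abs(approx - exact[5]))
```

Because each output is estimated from a fixed number of samples regardless of <inline-formula><tex-math notation="LaTeX">$N$</tex-math></inline-formula>, the per-output work is constant, which is the intuition behind the approximately <inline-formula><tex-math notation="LaTeX">${\mathrm O} (1)$</tex-math></inline-formula> complexity claim.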