CeMux: Maximizing the Accuracy of Stochastic Mux Adders and an Application to Filter Design
Stochastic computing (SC) is a low-cost computational paradigm that has promising applications in digital filter design, image processing, and neural networks. Fundamental to these applications is the weighted addition operation, which is most often implemented by a multiplexer (mux) tree. Mux-based adders have very low area but typically require long bitstreams to reach practical accuracy thresholds when the number of summands is large. In this work, we first identify the main contributors to mux adder error. We then demonstrate with analysis and experiment that two new techniques, precise sampling and full correlation, can target and mitigate these error sources. Implementing these techniques in hardware leads to the design of CeMux (Correlation-enhanced Multiplexer), a stochastic mux adder that is significantly more accurate and uses much less area than traditional weighted adders. We compare CeMux to other SC and hybrid designs for an electrocardiogram filtering case study that employs a large digital filter. One major result is that CeMux is shown to be accurate even for large input sizes. CeMux's higher accuracy leads to a latency reduction of 4× to 16× over other designs. Furthermore, CeMux uses about 35% less area than existing designs, and we demonstrate that a small amount of accuracy can be traded for a further 50% reduction in area. Finally, we compare CeMux to a conventional binary design and we show that CeMux can achieve a 50% to 73% area reduction for similar power and latency as the conventional design but at a slightly higher level of error.
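As a rough illustration of the weighted addition described above, the following sketch simulates a conventional mux adder in software. It is not CeMux itself: the abstract does not detail precise sampling or full correlation, so this sketch uses independent random bitstreams and a random select input. The function names, the unipolar encoding, and the example values are all assumptions made for illustration.

```python
import random

def to_bitstream(value, length, rng):
    """Unipolar encoding: each bit is 1 with probability `value` (0 <= value <= 1)."""
    return [1 if rng.random() < value else 0 for _ in range(length)]

def mux_weighted_add(values, weights, length, seed=0):
    """Estimate sum(w_i * x_i) by passing one input bit to the output per clock cycle."""
    rng = random.Random(seed)
    streams = [to_bitstream(v, length, rng) for v in values]
    indices = list(range(len(values)))
    ones = 0
    for t in range(length):
        # The select input picks stream i with probability weights[i].
        i = rng.choices(indices, weights=weights, k=1)[0]
        ones += streams[i][t]
    return ones / length

values = [0.2, 0.9, 0.5, 0.7]     # illustrative inputs in [0, 1]
weights = [0.1, 0.4, 0.3, 0.2]    # illustrative weights, non-negative and summing to 1
exact = sum(w * x for w, x in zip(weights, values))
estimate = mux_weighted_add(values, weights, length=4096)
print(f"exact={exact:.4f} estimate={estimate:.4f}")
```

Because only one input bit reaches the output per cycle, the estimate is noisy for short bitstreams, which is the accuracy problem the abstract refers to.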
- Addendum
34
- 10.1007/s12652-019-01431-x
- Sep 5, 2019
- Journal of Ambient Intelligence and Humanized Computing
Nowadays, optimal and intelligent design approaches are vital in almost all areas of engineering. Scientists and engineers are attempting to make frameworks and models more proficient and intelligent. This paper presents a detailed investigation of the design of various digital filters using optimization algorithms. Digital filters are generally classified into two types, FIR and IIR, and further into one-dimensional, two-dimensional, and three-dimensional filters for signal, image, and video processing, respectively. Designing a digital filter that satisfies all the required conditions perfectly is challenging. So, apart from the conventional mathematical methods, optimization algorithms can be used to design optimal digital filters. IIR filters are infinite impulse response filters; their impulse response has infinite duration. FIR filters are finite impulse response filters; their impulse response has finite duration. In this paper, we discuss the design of various optimal digital filters based on various optimization algorithms for processing signals, images, and video. Digital filter designs based on evolutionary algorithms and swarm intelligence algorithms are presented, including Genetic Algorithm, Particle Swarm Optimization, Artificial Bee Colony Optimization, Cuckoo Search Algorithm, Differential Evolution, Gravitational Search, Harmony Search, Spiral Optimization, teaching–learning based optimization, wind driven optimization, and hybridizations of optimization algorithms.
- Research Article
1
- 10.1080/00207219008920352
- Nov 1, 1990
- International Journal of Electronics
The design of a digital filter is presented. This filter is envisaged as replacing analogue bandpass filters for the band 312 kHz to 552 kHz, which corresponds to supergroup 2 of the International Telegraph and Telephone Consultative Committee (CCITT) frequency plan. The design starts by considering the transformation of an elliptic low-pass analogue filter into a bandpass analogue filter, which is then converted by bilinear transformation into the digital filter equivalent. The digital filter was simulated on a microcomputer and the results compare favourably with the results of Gray and Markel (1976), who have done similar work in the 3 kHz to 4 kHz band.
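The bilinear transformation mentioned above warps the frequency axis, so the analogue band edges are usually pre-warped before the low-pass to bandpass step. The sketch below shows only that arithmetic. The sampling rate `FS` is a hypothetical value chosen for illustration (the abstract does not state one), and the helper names are likewise assumptions.

```python
import math

def prewarp(f_hz, fs_hz):
    """Analogue angular frequency (rad/s) that the bilinear transform maps onto f_hz."""
    return 2.0 * fs_hz * math.tan(math.pi * f_hz / fs_hz)

def bilinear_map(omega_a, fs_hz):
    """Digital frequency (Hz) on which an analogue angular frequency omega_a lands."""
    return (fs_hz / math.pi) * math.atan(omega_a / (2.0 * fs_hz))

FS = 2.0e6                     # hypothetical sampling rate, not taken from the paper
F_LO, F_HI = 312e3, 552e3      # supergroup 2 band edges from the abstract

w_lo, w_hi = prewarp(F_LO, FS), prewarp(F_HI, FS)
w0 = math.sqrt(w_lo * w_hi)    # geometric centre frequency of the analogue bandpass
bw = w_hi - w_lo               # analogue bandwidth in rad/s
print(f"w_lo={w_lo:.1f} w_hi={w_hi:.1f} w0={w0:.1f} bw={bw:.1f}")
```

The values `w0` and `bw` are the parameters a standard low-pass to bandpass substitution would use; the elliptic prototype itself is outside the scope of this sketch.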
- Research Article
16
- 10.1109/jetcas.2023.3243604
- Mar 1, 2023
- IEEE Journal on Emerging and Selected Topics in Circuits and Systems
Stochastic computing (SC) is an alternative computing paradigm that processes data in the form of uniform bit-streams. SC is fault-tolerant and can compute on small, efficient circuits. However, SC is primarily used in scientific research, and its practical implementations for end-users are rare. Digital sound source localization (SSL) is a useful signal processing technique that locates speakers using multiple microphones. SC has not been integrated into SSL in practice or theory. In this work, for the first time to the best of our knowledge, we implement an SSL algorithm in the stochastic domain and develop a functional SC-based sound source localizer. The practical part of this work shows that the proposed stochastic circuit does not depend on conventional analog-to-digital conversion and can process data in the form of pulse-width-modulated (PWM) signals. The proposed SC design consumes up to 39% less area than the conventional binary design. It can also consume less power depending on the computational accuracy, for example, 6% less power consumption for 3-bit inputs. We propose a new cross-correlation (CC) design based on the state-of-the-art Sobol bit-streams for further area and power saving. The proposed design utilizes a MUX unit for bit-stream generation. It reduces the area footprint by up to 64% and the power consumption by up to 82% compared to the counter-based SC design of CC, which relies on a comparator for bit-stream generation. The presented stochastic circuits are not limited to SSL and are readily applicable to other practical applications such as radar ranging, wireless location, sonar direction finding, beamforming, and sensor calibration. The project’s source code is made available for public access.
- Research Article
195
- 10.1109/tnnls.2020.3009047
- Aug 5, 2020
- IEEE Transactions on Neural Networks and Learning Systems
Neural networks (NNs) are effective machine learning models that require significant hardware and energy consumption in their computing process. To implement NNs, stochastic computing (SC) has been proposed to achieve a tradeoff between hardware efficiency and computing performance. In an SC NN, hardware requirements and power consumption are significantly reduced by moderately sacrificing the inference accuracy and computation speed. With recent developments in SC techniques, however, the performance of SC NNs has been substantially improved, making it comparable with that of conventional binary designs while using less hardware. In this article, we begin with the design of a basic SC neuron and then survey different types of SC NNs, including multilayer perceptrons, deep belief networks, convolutional NNs, and recurrent NNs. Recent progress in SC designs that further improve the hardware efficiency and performance of NNs is subsequently discussed. The generality and versatility of SC NNs are illustrated for both the training and inference processes. Finally, the advantages and challenges of SC NNs are discussed with respect to binary counterparts.
- Research Article
7
- 10.1007/s00034-017-0656-9
- Sep 11, 2017
- Circuits, Systems, and Signal Processing
The design of digital IIR filters as a single-objective optimization problem using evolutionary algorithms has gained much attention in previous years. In this paper, filter design is treated as a multi-objective problem by simultaneously minimizing the magnitude response error, the linear phase response error, and the filter order within the stability constraints. The global search technique, predator–prey optimization (PPO), has been applied to design the digital IIR filter. The global search technique has been hybridized with a binary successive approximation (BSA)-based evolutionary search method for exploring the search space locally. The relative performance of PPO and hybrid PPO has been evaluated by applying these techniques to standard mathematical test functions. The proposed hybrid search technique has been applied to solve the multi-parameter, multi-objective optimization problem of low-pass (LP), high-pass (HP), band-pass (BP), and band-stop (BS) digital IIR filter design. The results obtained from the proposed technique are compared with the results of other algorithms applied by other researchers to the design of digital IIR filters.
- Research Article
1
- 10.5573/jsts.2020.20.5.436
- Oct 31, 2020
- JOURNAL OF SEMICONDUCTOR TECHNOLOGY AND SCIENCE
Stochastic computing, an approximate computing method using bitstreams, has attracted attention as an alternative to deterministic computing. Stochastic computing circuits are known to perform complex calculations with high density through probability calculations. Herein, we describe the design of an accurate and compact arithmetic circuit based on stochastic computing. First, we propose a simple technique to change the output of the random number generator, an integral part of stochastic computing, for stochastic multipliers and adders. Compared with conventional designs, the results indicate that the proposed design reduces power and area and enhances performance. This method uses a fully connected cube network and preserves accuracy without adding overhead. Subsequently, when this design is applied to real-world image processing, a 63% area reduction and 95% power savings are achieved compared to an accurate operator. Therefore, the proposed design is well suited to energy-efficient hardware designs with high imprecision tolerance.
- Conference Article
26
- 10.1109/iscas.1990.112147
- May 1, 1990
A simulated annealing optimization algorithm is used in the design of finite impulse response (FIR) filters and fifth-order LDI all-pole digital filters. Using simulated annealing, a sin(x)/x precompensating scheme based on both the FIR and LDI structures is presented. The design process begins with a random set of finite-precision coefficients for the filter, making no attempt to obtain a good initial starting coefficient. The use of simulated annealing allows the use of nonclassical transfer functions in the design of digital filters, i.e., it allows for the design of arbitrary magnitude response filters.
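To make the procedure above concrete, here is a minimal sketch of simulated annealing over finite-precision FIR coefficients, starting from a random coefficient set as the abstract describes. Everything else is an assumption for illustration: the low-pass target, the tap count, the word length, the cost function, and the cooling schedule are hypothetical and are not taken from the paper, and the LDI structures and sin(x)/x precompensation are not modelled.

```python
import cmath
import math
import random

def magnitude(coeffs, scale, omega):
    """|H(e^{j omega})| for an FIR filter whose taps are coeffs / scale."""
    return abs(sum(c * cmath.exp(-1j * omega * n) for n, c in enumerate(coeffs))) / scale

def cost(coeffs, scale, grid):
    """Largest deviation from the target magnitude over the frequency grid."""
    return max(abs(magnitude(coeffs, scale, w) - target) for w, target in grid)

def anneal(num_taps=9, bits=6, steps=3000, seed=1):
    rng = random.Random(seed)
    limit = 2 ** (bits - 1) - 1          # signed finite-precision coefficient range
    scale = float(limit)
    # Hypothetical target: pass band up to 0.3*pi, stop band from 0.6*pi to pi.
    grid = [(k * 0.3 * math.pi / 7, 1.0) for k in range(8)]
    grid += [(0.6 * math.pi + k * 0.4 * math.pi / 7, 0.0) for k in range(8)]
    # Random starting point; no attempt is made to find a good initial guess.
    current = [rng.randint(-limit, limit) for _ in range(num_taps)]
    current_cost = initial_cost = cost(current, scale, grid)
    best, best_cost = list(current), current_cost
    temperature = 1.0
    for _ in range(steps):
        candidate = list(current)
        i = rng.randrange(num_taps)
        # Perturb one coefficient by one quantization step, clamped to the range.
        candidate[i] = max(-limit, min(limit, candidate[i] + rng.choice((-1, 1))))
        candidate_cost = cost(candidate, scale, grid)
        delta = candidate_cost - current_cost
        # Metropolis rule: always accept improvements, sometimes accept worse moves.
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current, current_cost = candidate, candidate_cost
            if current_cost < best_cost:
                best, best_cost = list(current), current_cost
        temperature = max(1e-4, temperature * 0.998)   # geometric cooling
    return best, best_cost, initial_cost

best, best_cost, initial_cost = anneal()
print(best, round(best_cost, 4), round(initial_cost, 4))
```

Because the cost is evaluated directly on a magnitude grid, the target need not correspond to a classical transfer function, which is the flexibility the abstract highlights.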
- Conference Article
15
- 10.1145/2902961.2902978
- May 18, 2016
The discrete Fourier transform (DFT) and fast Fourier transform (FFT) are widely used techniques in numerous modern signal processing applications. In general, because of their inherent multiplication-intensive characteristics, hardware implementations of the DFT/FFT usually require a large amount of hardware resources, which limits their application in area-constrained scenarios. To overcome this challenge, this paper, for the first time, proposes area-efficient error-resilient DFT designs using stochastic computing. By leveraging low-complexity stochastic multipliers, two types of stochastic DFT design are presented with a significant reduction in overall area. Analysis results show that, compared with the conventional design, the two proposed 256-point stochastic DFT designs achieve 76% and 62% reductions in area, respectively. More importantly, these stochastic DFT designs also show much stronger error resilience, which is very attractive in the nanoscale CMOS era.
- Conference Article
44
- 10.1109/aspdac.2017.7858405
- Jan 1, 2017
Stochastic Computing (SC) is an alternative design paradigm particularly useful for applications where cost is critical. SC has been applied to neural networks, as neural networks are known for their high computational complexity. However, previous work in this area has critical limitations, such as the assumption of a fully parallel architecture, which prevent it from being applicable to recent networks such as convolutional neural networks, or ConvNets. This paper presents the first SC architecture for ConvNets and shows its feasibility with detailed analyses of implementation overheads. Our SC-ConvNet is a hybrid between SC and conventional binary design, which is a marked difference from earlier SC-based neural networks. Though this might seem like a compromise, it is a novel feature driven by the need to support modern ConvNets at scale, which commonly have many large layers. Our proposed architecture also features hybrid layer composition, which helps achieve very high recognition accuracy. Our detailed evaluation results involving functional simulation and RTL synthesis suggest that SC-ConvNets are indeed competitive with conventional binary designs, even without considering the inherent error resilience of SC.
- Research Article
1
- 10.29109/http-gujsc-gazi-edu-tr.335872
- Dec 22, 2017
- Gazi Üniversitesi Fen Bilimleri Dergisi Part C: Tasarım ve Teknoloji
Digital filters, which are used in many signal processing applications, can be classified as recursive or nonrecursive. Since nonrecursive digital filters can provide inherent stability and an exact linear phase characteristic, they have an important place in the literature. In this paper, a new hybrid window function is proposed for the design of nonrecursive digital filters. This new window was obtained by combining two different window functions known in the literature as Kaiser and Von-Hann. First, the effects of the two independent parameters of the proposed window function on the digital filter characteristic are analysed in terms of minimum stopband attenuation and transition bandwidth. Then, comparative examples for different filter lengths are given to compare the performance of the proposed window function in digital filter design with that of other well-known two-parameter window functions. Simulation results demonstrate that the proposed window function can provide a better filter design than the Kaiser-Hamming, Saramaki, Kaiser, Dolph-Chebychev, Cosh, Exponential, and Gaussian window functions.
- Conference Article
1
- 10.1109/tpec54980.2022.9750824
- Feb 28, 2022
Extraction of instantaneous symmetrical components plays a vital role in the operation of grid-connected converters. Conventionally, extraction of instantaneous symmetrical components is achieved by transforming the signal into a rotating reference frame and passing it through a digital filter. This work focuses on the design and performance of various digital filters adopted for the extraction of instantaneous symmetrical components. The design considerations and associated trade-offs involved in the design of digital filters are first outlined. Subsequently, the design procedure involved in commonly adopted filtering strategies is reviewed and their performance under various operating conditions is studied. In addition, this work also analyzes the impact of filter design on closed-loop applications such as phase locked loops.
- Research Article
11
- 10.1016/j.sigpro.2021.108040
- Feb 12, 2021
- Signal Processing
Low-area and accurate inner product and digital filters based on stochastic computing
- Conference Article
1
- 10.1109/isocc53507.2021.9613856
- Oct 6, 2021
Artificial Neural Networks (ANNs) have shown their superiority in many applications in academia and industry. However, the hardware architecture of an ANN requires many operation units, which results in high area and power overhead. On the other hand, the Stochastic Computing (SC) method has been proven to be an efficient way to achieve low-power computing with a small area overhead. Therefore, many SC-based ANNs have been proposed in recent years. However, due to stochastic bit-stream computing, the conventional SC-based ANN designs suffer from low computing accuracy. In this work, we use a parallel counter (PC) to replace the SC-based multiply-accumulator (MAC) to solve the accuracy problem in conventional SC-based ANN designs. In addition, we propose a finite state machine (FSM)-based activation function to improve the efficiency of the data representation change in SC-based ANN computing. Compared with the conventional SC-based ANN designs, our proposed architecture improves computing accuracy by 82.2%. Furthermore, our proposed architecture reduces area cost by 95.8% and power consumption by 94.2% compared with a non-SC-based ANN design, achieving higher hardware efficiency.
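The idea of accumulating with a parallel counter can be sketched in software as follows. This is an assumed, commonly used arrangement (an AND gate per input pair for unipolar multiplication, followed by a counter that sums the product bits each cycle) and not necessarily the architecture of the paper, whose details the abstract does not give. The function names and example values are hypothetical.

```python
import random

def to_bitstream(value, length, rng):
    """Unipolar encoding: each bit is 1 with probability `value` (0 <= value <= 1)."""
    return [1 if rng.random() < value else 0 for _ in range(length)]

def sc_mac_parallel_counter(inputs, weights, length, seed=0):
    """Estimate sum(x_i * w_i) with AND-gate multipliers and a parallel counter."""
    rng = random.Random(seed)
    x_streams = [to_bitstream(x, length, rng) for x in inputs]
    w_streams = [to_bitstream(w, length, rng) for w in weights]
    total = 0
    for t in range(length):
        # One AND gate per input pair performs unipolar SC multiplication.
        products = [xs[t] & ws[t] for xs, ws in zip(x_streams, w_streams)]
        # The parallel counter adds all product bits of this cycle in binary,
        # so no bits are discarded as they would be in a mux-based adder.
        total += sum(products)
    return total / length

inputs = [0.5, 0.25, 0.8]      # illustrative activations in [0, 1]
weights = [0.6, 0.9, 0.3]      # illustrative weights in [0, 1]
exact = sum(x * w for x, w in zip(inputs, weights))
estimate = sc_mac_parallel_counter(inputs, weights, length=4096)
print(f"exact={exact:.4f} estimate={estimate:.4f}")
```

Unlike a mux adder, the result is not scaled down by the number of inputs, at the cost of a binary accumulator at the output.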
- Conference Article
6
- 10.1109/pacrim.2015.7334871
- Aug 1, 2015
The compact arithmetic units in stochastic computing can potentially lower the implementation cost with respect to silicon area and power consumption. In addition, stochastic computing provides inherent tolerance of transient errors at the cost of a less efficient signal encoding. In this paper, a novel FIR filter design using the stochastic approach based on multiplexers is proposed. The required stochastic sequence length is determined for different signal resolutions by matching the performance of the proposed FIR filter with that of the conventional binary design. Silicon area, power, and maximum clock frequency are obtained to evaluate the throughput per area (TPA) and the energy per operation (EPO). For equivalent filtering performance, the stochastic FIR filter underperforms in terms of TPA and EPO compared to the conventional binary design, albeit with some advantages in circuit area and power consumption. The stochastic design, however, shows a graceful degradation in performance with a significant reduction in energy consumption as the stochastic sequences are shortened. The fault tolerance of the stochastic circuit is compared with that of the binary circuit equipped with triple modular redundancy (TMR). It is shown that the stochastic circuit is more reliable than the conventional binary design and its TMR implementation with unreliable voters, but it is less reliable than the binary TMR implementation when the voters are fault-free.
- Conference Article
22
- 10.1109/icassp.2016.7472940
- Mar 1, 2016
In recent years, stochastic computing (SC) has been regaining attention for its low hardware cost and strong error resilience, which are key metrics in the nanoscale CMOS era. However, the potential deployment of SC in practical applications is impeded by the long latency of sequential bit-streams and the high complexity of pseudo-random number generators (PRNGs). To mitigate these challenges, this paper explores the design space for hardware-efficient stochastic computing with a case study on the 4-point discrete cosine transform (DCT). First, an efficient compensation mechanism is proposed to solve the scaling problem of SC systems. Then, two approaches, namely Splitting-Shuffling (SS) and PRNG sharing, are proposed to reduce the overall area and processing latency, respectively. Analysis results show that, at the same computing accuracy, the joint use of the proposed approaches yields a 44% reduction in area and a 49% reduction in latency compared with the conventional SC design.