Pipelined Coordinate Rotation Digital Computer Research Articles

Practical implementation of deep neural networks (DNNs) demands significant hardware resources, including high computational power and memory bandwidth. Existing field-programmable gate array (FPGA)–based DNN accelerators are primarily optimized for fast single-task performance, yet cost, energy efficiency, and overall throughput are also crucial for practical use across applications. This article proposes a performance-centric pipelined Coordinate Rotation Digital Computer (CORDIC)–based multiply-accumulate (MAC) unit and implements a scalable CORDIC-based DNN architecture that is area- and power-efficient and has high throughput. The CORDIC-based neuron engine uses bit-rounding to maintain input-output precision with minimal hardware resource overhead. The results demonstrate the versatility of the proposed pipelined MAC, which operates at 460 MHz and allows for higher network throughput. A software-based implementation platform evaluates the accuracy of the proposed MAC operation on larger neural networks and more complex datasets. The DNN accelerator is designed with a parameterized, modular, layer-multiplexed architecture. Empirical evaluation through Pareto analysis is used to improve the efficiency of DNN implementations by fixing the arithmetic precision and the optimal number of pipeline stages. Layer-multiplexing reuses a single DNN layer to enhance efficiency while maintaining modularity and adaptability for integrating various network configurations. The proposed CORDIC MAC-based DNN architecture is scalable to any bit precision and network size, and the DNN accelerator is prototyped on the Xilinx Virtex-7 VC707 FPGA board, operating at 66 MHz. The proposed design does not use any Xilinx macros, making it easily adaptable for ASIC implementation. Compared with state-of-the-art designs, the proposed design reduces resource use by 45% and power consumption by 4× without sacrificing performance. The accelerator is validated using the MNIST dataset, achieving 95.06% accuracy, only 0.35% less than other cutting-edge implementations.
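The abstract does not spell out the internals of the MAC unit, so the following is only a minimal sketch of the textbook linear-mode CORDIC iteration that such a unit is commonly built on, not the authors' design. It computes `acc + x * w` using shifts and additions only. The function names, the 16-bit fractional format, and the restriction |w| < 1 are assumptions made for illustration. In a pipelined hardware version, each loop iteration would typically correspond to one pipeline stage.

```python
def cordic_mac(acc, x, w, iterations=16, frac_bits=16):
    """Return approximately acc + x * w using linear-mode CORDIC.

    Hypothetical helper for illustration. Inputs are floats for
    convenience; internally all values are fixed-point integers with
    `frac_bits` fractional bits. Assumes |w| < 1 so the iteration
    converges.
    """
    scale = 1 << frac_bits
    xi = int(round(x * scale))    # multiplicand, held constant
    yi = int(round(acc * scale))  # running accumulator
    zi = int(round(w * scale))    # residual weight, driven towards zero
    for i in range(1, iterations + 1):
        d = 1 if zi >= 0 else -1
        yi += d * (xi >> i)       # add or subtract x * 2^-i
        zi -= d * (scale >> i)    # remove 2^-i from the residual
    return yi / scale


def cordic_dot(xs, ws, **kwargs):
    """Accumulate a dot product by chaining cordic_mac calls."""
    acc = 0.0
    for x, w in zip(xs, ws):
        acc = cordic_mac(acc, x, w, **kwargs)
    return acc


if __name__ == "__main__":
    print(cordic_mac(0.5, 0.75, 0.5))              # close to 0.875
    print(cordic_dot([0.5, -0.25], [0.5, 0.75]))   # close to 0.0625
```

The result differs from the exact product by a few least-significant bits, because each right shift truncates and the residual never reaches exactly zero. This is the kind of precision effect that rounding schemes in a hardware MAC are meant to control.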

RISC-based embedded processors aimed at low cost and low power are becoming an increasingly popular ecosystem for both hardware and software development. High-performance yet low-power embedded processors may be attained through hardware acceleration and Instruction Set Architecture (ISA) extension. Recent AI publications have demonstrated the use of Coordinate Rotation Digital Computer (CORDIC) as a dedicated low-power solution for solving the nonlinear equations that arise in Neural Networks (NN). This paper proposes an ISA extension to support floating-point CORDIC, providing efficient hardware acceleration for mathematical functions. A new DMA-based ISA extension approach integrated with a pipelined CORDIC accelerator is proposed. The CORDIC ISA extension is directly interfaced with a standard processor data path, allowing efficient implementation of new trigonometric ALU-based custom instructions. The proposed DMA-based CORDIC accelerator can also be used to perform repeated array calculations, offering a significant speedup over software implementations. The proposed accelerator is evaluated on an Intel Cyclone-IV FPGA as an extension to the Nios processor. Experimental results show a speedup of over three orders of magnitude compared with a software implementation when applied to trigonometric arrays, and the accelerator outperforms the existing commercial CORDIC hardware accelerator.
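For readers unfamiliar with how CORDIC evaluates trigonometric functions, the following is a minimal software sketch of the standard rotation-mode iteration. It models the arithmetic with Python floats and is not the accelerator or instruction set described in the abstract. The function name, the iteration count, and the input range (roughly |theta| ≤ 1.74 rad) are assumptions for illustration.

```python
import math


def cordic_sin_cos(theta, iterations=24):
    """Return (sin(theta), cos(theta)) via rotation-mode CORDIC.

    Hypothetical helper for illustration. Valid for |theta| up to
    about 1.74 rad, the sum of the elementary rotation angles.
    """
    # Elementary angles atan(2^-i); hardware would store these in a table.
    angles = [math.atan(2.0 ** -i) for i in range(iterations)]
    # Each micro-rotation stretches the vector; pre-divide by the total gain.
    gain = 1.0
    for i in range(iterations):
        gain *= math.sqrt(1.0 + 2.0 ** (-2 * i))
    x, y, z = 1.0 / gain, 0.0, theta
    for i in range(iterations):
        d = 1.0 if z >= 0 else -1.0
        # In fixed-point hardware the multiplications by 2^-i are shifts.
        x, y = x - d * y * 2.0 ** -i, y + d * x * 2.0 ** -i
        z -= d * angles[i]
    return y, x


if __name__ == "__main__":
    # Array use, loosely analogous to streaming operands through an accelerator.
    for t in (0.0, 0.5, -1.2):
        s, c = cordic_sin_cos(t)
        print(t, s, c)
```

Each pass through the loop depends only on the previous one, which is why the algorithm unrolls naturally into a pipeline that can accept a new operand every cycle when processing arrays.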

Articles published on Pipelined Coordinate Rotation Digital Computer

An Empirical Approach to Enhance Performance for Scalable CORDIC-Based Deep Neural Networks

Research and Implementation of a Numerical Control Oscillator with Improved Pipelined CORDIC Algorithm

CORDIC Hardware Acceleration Using DMA-Based ISA Extension

Low latency pipelined CORDIC-like rotator architecture

A Modified CORDIC FPGA Implementation for Wave Generation

CORDIC‐based window implementation to minimise area and pipeline depth

Reconfigurable Design of Pipelined CORDIC Processor for Digital Sine-Cosine

Novel architecture for QAM modulator–demodulator and its generalization to multicarrier modulation

Virtually scaling-free adaptive CORDIC rotator

Efficient implementations of pipelined CORDIC based IIR digital filters using fast orthonormal μ-rotations

An efficient CORDIC array structure for the implementation of discrete cosine transform
