Abstract

RISC-based embedded processors aimed at low cost and low power form an increasingly popular ecosystem for both hardware and software development. High-performance yet low-power embedded processors can be attained through hardware acceleration and Instruction Set Architecture (ISA) extension. Recent AI publications have demonstrated the use of the Coordinate Rotation Digital Computer (CORDIC) as a dedicated low-power solution for solving nonlinear equations applied to Neural Networks (NN). This paper proposes an ISA extension to support floating-point CORDIC, providing efficient hardware acceleration for mathematical functions. A new DMA-based ISA extension approach integrated with a pipelined CORDIC accelerator is proposed. The CORDIC ISA extension is directly interfaced with a standard processor data path, allowing efficient implementation of new trigonometric ALU-based custom instructions. The proposed DMA-based CORDIC accelerator can also be used to perform repeated array calculations, offering a significant speedup over software implementations. The proposed accelerator is evaluated on an Intel Cyclone-IV FPGA as an extension to the Nios processor. Experimental results show a speedup of over three orders of magnitude compared with a software implementation when applied to trigonometric arrays, and the accelerator outperforms the existing commercial CORDIC hardware accelerator.

Highlights

  • In recent years, the complexity of embedded platforms, such as Internet of Things (IoT) devices, has been increasing steadily, with conflicting requirements for high performance and real-time capabilities versus minimal power and size

  • This paper proposes a floating-point Coordinate Rotation Digital Computer (CORDIC) accelerator aimed at extending the arithmetic logic unit (ALU) instruction set with computation-intensive transcendental functions, specifically the trigonometric family

  • In contrast to other existing CORDIC accelerators, we propose a DMA-based Instruction Set Architecture (ISA) extension integrated with a pipelined CORDIC accelerator
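For readers unfamiliar with CORDIC, the rotation-mode iteration that such an accelerator implements in hardware can be sketched in software. This is the generic textbook formulation, not the paper's floating-point pipelined design; the function name `cordic_sin_cos` and the default of 32 iterations are assumptions made for illustration only.

```python
import math


def cordic_sin_cos(theta, iterations=32):
    """Approximate (sin(theta), cos(theta)) with rotation-mode CORDIC.

    Illustrative software model only; valid for theta in [-pi/2, pi/2].
    """
    # Elementary rotation angles atan(2^-i); hardware would keep these in a small table.
    angles = [math.atan(2.0 ** -i) for i in range(iterations)]

    # Each micro-rotation scales the vector by sqrt(1 + 2^-2i);
    # pre-scaling the start vector by the inverse gain cancels this.
    gain_inv = 1.0
    for i in range(iterations):
        gain_inv /= math.sqrt(1.0 + 2.0 ** (-2 * i))

    x, y, z = gain_inv, 0.0, theta
    for i in range(iterations):
        d = 1.0 if z >= 0.0 else -1.0      # rotate toward zero residual angle
        shift = 2.0 ** -i                  # a bit shift in fixed-point hardware
        x, y = x - d * y * shift, y + d * x * shift
        z -= d * angles[i]
    return y, x


if __name__ == "__main__":
    s, c = cordic_sin_cos(0.5)
    print(s, c, math.sin(0.5), math.cos(0.5))
```

Each iteration needs only additions, a sign test, and a scaling by a power of two, which is why the algorithm maps well onto a pipelined hardware unit with one stage per iteration.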


Summary

Introduction

The complexity of embedded platforms, such as Internet of Things (IoT) devices, has been increasing steadily, with conflicting requirements for high performance and real-time capabilities versus minimal power and size. Extending the RISC ISA with application-specific custom instructions allows flexible and efficient implementation in hardware. A custom ISA extension lets the instruction groups that the application needs be implemented precisely as optimized hardware, maximizing performance while minimizing power. Examples of processors that support custom ISA extensions include Tensilica Xtensa [1], Intel Nios [2], and RISC-V with its open-source ISA extensions [3].

[...] b, and c specify the internal registers from which to read or to which to write. This parameterization option can be enabled in both combinatorial and multicycle custom instructions.
