Abstract

This paper documents research into different solutions for implementing a Neural Network architecture in an FPGA design using floating point accelerators. In particular, two implementations are investigated: a high level solution that builds the neural network on a soft processor design, with different strategies for enhancing performance; and a low level solution, realized as a cascade of floating point arithmetic elements. The architectures are compared in terms of both execution time and FPGA resources employed.
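As context for the workload that both architectures accelerate, the following minimal C sketch shows the floating point operations of one fully connected layer of a feed forward network: a multiply-accumulate loop per neuron followed by an activation function. The names (forward_layer, weights, bias) and the choice of a sigmoid activation are illustrative assumptions, not code from the paper.

```c
#include <math.h>
#include <stddef.h>

/* Illustrative sketch of the FP workload of one fully connected layer:
 * a multiply-accumulate loop per neuron followed by an activation.
 * Names and the sigmoid choice are assumptions for illustration. */
static float sigmoid(float x)
{
    return 1.0f / (1.0f + expf(-x));
}

void forward_layer(const float *in, size_t n_in,
                   const float *weights, /* n_out x n_in, row-major */
                   const float *bias,    /* n_out */
                   float *out, size_t n_out)
{
    for (size_t j = 0; j < n_out; ++j) {
        float acc = bias[j];
        for (size_t i = 0; i < n_in; ++i)
            acc += weights[j * n_in + i] * in[i]; /* FP multiply-add */
        out[j] = sigmoid(acc);                    /* FP activation */
    }
}
```

Every iteration of the inner loop is a floating point multiply and add, which is exactly the kind of operation the compared FPGA architectures try to execute faster than a plain soft-core FPU.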

Highlights

  • Field Programmable Gate Arrays (FPGA) designs are very common in the field of computational electronics [1], [2], [3]

  • Digital Signal Processing (DSP) models, often analyzed in high level environments, show heavy performance restraints once implemented on embedded systems, whose bottleneck is, despite the ongoing advances in Floating Point Unit (FPU) development, the low number of floating point operations per second (FLOPS) [4]

  • The first solution attempted to implement the network on FPGA makes use of the Nios II/f soft core processor, released by Altera® as an encrypted core (see the sketch below)
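To make the soft-core route concrete, the fragment below sketches how a floating point operation could be delegated from C code to custom logic attached to the Nios II processor, assuming the custom-instruction built-ins provided by the Nios II GCC toolchain; the opcode index FP_MUL_CI_N and the wrapper name fp_mul_hw are hypothetical placeholders, not identifiers from the paper.

```c
/* Hedged sketch: issuing a floating point multiply as a Nios II custom
 * instruction.  __builtin_custom_fnff(n, a, b) issues custom instruction
 * number n with two float operands and returns a float result; the index
 * FP_MUL_CI_N is a placeholder for whatever value the accelerator is
 * assigned when it is added to the system. */
#define FP_MUL_CI_N 0  /* hypothetical custom-instruction index */

static inline float fp_mul_hw(float a, float b)
{
    return __builtin_custom_fnff(FP_MUL_CI_N, a, b);
}
```

Whether the result is available immediately or after several clock cycles depends on whether the custom instruction is implemented as combinatorial or sequential logic, as discussed in the Introduction.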


Summary

Introduction

Field Programmable Gate Array (FPGA) designs are very common in the field of computational electronics [1], [2], [3]. In the literature, different approaches have been followed to reduce the computational cost of the network's activation function, using piecewise linear interpolation [11], polynomial fitting techniques [12], [13], [14], [15], enhanced computational algorithms [16], [17] and Look-Up Tables [18], [19], [20], [21]. Customization also allows the designer to implement blocks inside the FPGA to speed up floating point (FP) operations: operands are transferred to the custom logic and, according to the type of custom instruction (combinatorial or sequential), the result is collected after a definite number of clock cycles [26].
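As a concrete illustration of the activation-cost reduction strategies cited above, the following sketch approximates the sigmoid with the widely used PLAN piecewise linear scheme; the breakpoints and slopes shown are the standard PLAN values and are only illustrative here, since the paper's own approach relies on its own polynomial fitting coefficients.

```c
#include <math.h>

/* Illustrative piecewise linear sigmoid (PLAN-style) approximation:
 * only comparisons, one multiply and one add per evaluation, so it maps
 * cheaply onto FPGA logic compared with computing expf(). */
static float sigmoid_pwl(float x)
{
    float ax = fabsf(x);
    float y;

    if (ax >= 5.0f)
        y = 1.0f;                       /* saturation region */
    else if (ax >= 2.375f)
        y = 0.03125f * ax + 0.84375f;   /* shallow outer segment */
    else if (ax >= 1.0f)
        y = 0.125f * ax + 0.625f;       /* intermediate segment */
    else
        y = 0.25f * ax + 0.5f;          /* near-linear region around 0 */

    return (x >= 0.0f) ? y : 1.0f - y;  /* symmetry: s(-x) = 1 - s(x) */
}
```

A Look-Up Table or a low-order polynomial fit trades memory or multipliers for accuracy in a similar way; which option wins depends on the FPGA resources the rest of the design leaves available.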

The Feed Forward Neural Network
Overall System Description
Polynomial Fitting
Test Results and Considerations
Data Flow of the Arithmetic Core
Time Machine FSM
Solutions Comparison
Conclusions and Future