Abstract
In this paper, we present a pipelined, high-throughput, single-precision floating-point implementation of the exponential function with a latency of 19 cycles. We also present the Application Specific Processor (ASP), a multiple-memory architecture that increases the memory bandwidth and overall performance of computationally intensive applications. The exponential function hardware unit serves as the function core, or arithmetic unit, of the ASP. Our experimental results show that a hardware implementation of the exponential function on a Field Programmable Gate Array (FPGA) is significantly faster than a software implementation on a multi-core processor. Although the maximum clock rate of our FPGA board (200 MHz) is an order of magnitude lower than that of our multi-core processor (3.4 GHz), the FPGA-based hardware implementation of the exponential function is 29X faster than the multi-core processor-based software implementation and 8X faster than the OpenMP implementation.
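To illustrate why a pipelined unit can outperform a processor with a much higher clock rate, the following minimal sketch estimates the time such a unit needs to process a stream of inputs. The 19-cycle latency and the 200 MHz clock come from the abstract. The initiation interval of one cycle (one new input accepted per clock) is an assumption made for illustration, as are the function names; the sketch also ignores memory bandwidth limits, which the ASP's multiple-memory architecture is meant to address.

```python
def pipeline_cycles(n_inputs: int, latency: int = 19, initiation_interval: int = 1) -> int:
    """Cycles until the last of n_inputs results leaves a pipelined unit.

    latency: cycles from an input entering to its result leaving (19 in the abstract).
    initiation_interval: cycles between successive inputs (assumed 1, not stated in the abstract).
    """
    if n_inputs <= 0:
        return 0
    # The first result appears after `latency` cycles; each further result
    # follows `initiation_interval` cycles after the previous one.
    return latency + (n_inputs - 1) * initiation_interval


def pipeline_time_seconds(n_inputs: int, clock_hz: float = 200e6,
                          latency: int = 19, initiation_interval: int = 1) -> float:
    """Wall-clock time for n_inputs results at the given clock rate."""
    return pipeline_cycles(n_inputs, latency, initiation_interval) / clock_hz


if __name__ == "__main__":
    for n in (1, 1_000, 1_000_000):
        print(n, pipeline_cycles(n), pipeline_time_seconds(n))
```

Under these assumptions the 19-cycle latency is paid only once, so for large input streams the cost approaches one result per clock cycle, regardless of how many cycles a single evaluation takes.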