Abstract

This paper proposes an architecture inspired by ARM big.LITTLE that combines a hardened host with a cluster of soft processors of differing complexity, performance and energy profiles. This coarse-grained FPGA overlay architecture yields a hardware accelerator that offers software-like programmability, fast compilation, improved design productivity and application portability. A programming flow based on OpenCL is introduced to allow application programmers to implement parallel algorithms at a higher level of abstraction. Current OpenCL tools for FPGAs suffer from long compilation times and limited compiler support: minor changes to the algorithm normally require a full implementation cycle that can take several hours to complete. The proposed architecture allows the application to be changed at run time, with cross-compilation performed on the host during program execution. To compensate for the loss of performance compared with custom logic, the FPGA cluster supports adaptive voltage scaling, which enables higher clock frequencies and better adaptation to the program load. Experimental results demonstrate a 70% improvement in computation time and an 80% reduction in energy consumption when OpenCL kernels are computed on different clusters at various operating voltages and frequencies.
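The trade-off between run time and energy across operating points can be illustrated with the textbook CMOS dynamic power relation P = C·V²·f. The sketch below is not the paper's model, which the abstract does not specify. The names `OperatingPoint`, `dynamic_power_w`, `run_time_s` and `dynamic_energy_j`, the effective switched capacitance, and the voltages and frequencies are all hypothetical values chosen for illustration, and static (leakage) power is ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OperatingPoint:
    """Hypothetical voltage/frequency pair for one soft-processor cluster."""
    voltage_v: float      # core supply voltage in volts
    frequency_hz: float   # clock frequency in hertz


def dynamic_power_w(op: OperatingPoint, switched_capacitance_f: float) -> float:
    """Textbook CMOS dynamic power: P = C_eff * V^2 * f (leakage ignored)."""
    return switched_capacitance_f * op.voltage_v ** 2 * op.frequency_hz


def run_time_s(cycles: int, op: OperatingPoint) -> float:
    """Time to execute a fixed number of clock cycles at this frequency."""
    return cycles / op.frequency_hz


def dynamic_energy_j(cycles: int, op: OperatingPoint,
                     switched_capacitance_f: float) -> float:
    """Energy = power * time, which reduces to C_eff * V^2 * cycles."""
    return dynamic_power_w(op, switched_capacitance_f) * run_time_s(cycles, op)


if __name__ == "__main__":
    # Assumed illustrative values, not measurements from the paper.
    c_eff = 1e-9            # effective switched capacitance in farads
    kernel_cycles = 1_000_000
    nominal = OperatingPoint(voltage_v=1.0, frequency_hz=100e6)
    scaled = OperatingPoint(voltage_v=0.8, frequency_hz=100e6)

    for name, op in (("nominal", nominal), ("scaled", scaled)):
        t = run_time_s(kernel_cycles, op)
        e = dynamic_energy_j(kernel_cycles, op, c_eff)
        print(f"{name}: time = {t:.6f} s, dynamic energy = {e:.6f} J")
```

Under this simplified model, the dynamic energy of a fixed-cycle kernel depends on the square of the supply voltage and not on the clock frequency, whereas run time depends only on the frequency. This is why voltage scaling is a natural lever for trading energy against performance.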
