Abstract

This paper presents an accelerator-rich system-on-chip (SoC) architecture integrating many heterogeneous Coarse Grain Reconfigurable Arrays (CGRA) connected through a Network-on-Chip (NoC). The architecture is designed to maximize the reconfigurable processing capacity for the execution of massively parallel algorithms. The central node of the NoC contains a Reduced Instruction Set Computer (RISC) core that manages distribution of computing functions and data within the SoC while the other nodes contain CGRAs of application-specific sizes. Prior approaches coupled only a few accelerators with a RISC core using special instructions and/or a direct memory access device. In contrast, our design couples a RISC core to many CGRAs through the NoC. This approach provides for independent and simultaneous execution of multiple computing kernels. Furthermore, the proposed architecture mitigates power dissipation as CGRA sizes are tailored for the individual application kernels. We present a proof-of-concept design with a total of 408 reconfigurable processing elements. This instance and its sub-systems are customized and tested for different computationally-intensive signal processing algorithms. The overall single-chip computing system is synthesized for a Field Programmable Gate Array device. We present comparison to and evaluation against some of the existing multicore systems in terms of multiple performance metrics.

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.