Abstract

Coarse-grained reconfigurable architectures (CGRAs) have emerged as promising accelerators due to their high flexibility and energy efficiency. However, existing open source works often lack integration of CGRAs with CPU systems and corresponding toolchains. Moreover, there is rare support for the accelerator instruction pipelining to overlap data communication, computation, and configuration across multiple tasks. In this article, we propose FDRA, an open source exploration framework for a heterogeneous system-on-chip (SoC) with a RISC-V processor and a dynamically reconfigurable accelerator (DRA) supporting loop, instruction, and task levels of parallelism. FDRA encompasses parameterized SoC modeling, Verilog generation, source-to-source application code transformation using frontend and DRA compilers, SoC simulation, and FPGA prototyping. FDRA incorporates the extraction of periodic accumulative operators and multi-dimensional linear load/store operators from nested loops. The DRA enables accessing the shared L2 cache with virtual addresses and supports direct memory access with arbitrary start addresses and data lengths. Integrated into the RISC-V Rocket SoC, our DRA achieves a remarkable 55× acceleration for loop kernels and improves energy efficiency by 29×. Compared to state-of-the-art RISC-V vector units, our DRA demonstrates a 2.9× speed improvement and 3.5× greater energy efficiency. In contrast to previous CGRA+RISC-V SoCs, our SoC achieves a minimum speedup of 5.2×.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call