Abstract

The integration of FPGA-based accelerators into a complete heterogeneous system is a challenging task faced by many researchers and engineers, especially now that FPGAs enjoy increasing popularity as implementation platforms for efficient, application-specific accelerators for domains such as signal processing, machine learning and intelligent storage. To lift the burden of system integration from the developers of accelerators, the open-source TaPaSCo framework presented in this work provides an automated toolflow for the construction of heterogeneous many-core architectures from custom processing elements, and a simple, uniform programming interface to utilize spatially distributed, parallel computation on FPGAs. TaPaSCo aims to increase the scalability and portability of FPGA designs through automated design space exploration, greatly simplifying the scaling of hardware designs and facilitating iterative growth and portability across FPGA devices and families. This work describes TaPaSCo with its primary design abstractions and shows how TaPaSCo addresses portability and extensibility of FPGA hardware designs for systems-on-chip. A study of successful projects using TaPaSCo shows its versatility and can serve as inspiration and reference for future users. More details on the usage of TaPaSCo are presented in an in-depth case study and a short overview of the workflow.

Highlights

  • Compared to modern software development methods, it has been and still is very hard to achieve scalability and portability for FPGA-based solutions. While microprocessor instruction set architectures have become commodities and are nowadays mostly interchangeable due to high-level software programming abstractions and powerful compilers, FPGA development is still very close to the metal.

  • This paper presents TaPaSCo, the Task Parallel Systems Composer, an open-source toolchain addressing these challenges.

  • Whereas TaPaSCo supports both high-level synthesis (HLS) and HDLs to define the accelerator cores, CMOST [78] generates a full system architecture from a C program by revisiting approaches found in HLS compilers [9, 11, 21]: It extends the loop unrolling and extraction techniques based on the polyhedral model to task-based parallel computing, with a similar split into a hardware-dependent and a hardware-independent part of the generated designs, and support for SoCs and some PCIe-based systems (e.g., Xilinx VC707).


Summary

Introduction

Compared to modern software development methods, it has been and still is very hard to achieve scalability and portability for FPGA-based solutions. TaPaSCo consists of a scriptable toolflow for the automated construction of heterogeneous, many-core System-on-Chip hardware architectures, and a set of APIs to facilitate task-parallel computing on TaPaSCo FPGA accelerator designs. It should be noted that its main contribution lies in neither of these fields. Instead, TaPaSCo aims to harness and amplify the power of existing tools and approaches by providing the missing glue between state-of-the-art HLS tools and modern parallel computing paradigms and languages: It allows the designers of FPGA accelerators to raise their level of abstraction and disregard many specific features of the target FPGA by delegating the optimization of these choices to TaPaSCo’s automated design space exploration.
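To make the idea of delegating scaling decisions to an automated exploration more concrete, the following is a minimal sketch of one such decision: scaling a fixed mix of processing elements (PEs) up to the largest multiple that fits a device. It is not TaPaSCo's API or its exploration algorithm. The names `PEKind` and `scale_composition`, the LUT costs, and the use of LUTs as the only resource are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PEKind:
    """Hypothetical processing element type with an assumed per-instance LUT cost."""
    name: str
    luts: int


def scale_composition(ratio: dict[PEKind, int], lut_budget: int) -> dict[str, int]:
    """Return the largest whole multiple of `ratio` that fits within `lut_budget`.

    `ratio` maps each PE kind to its instance count in one unit of the mix.
    """
    cost_per_unit = sum(kind.luts * count for kind, count in ratio.items())
    if cost_per_unit <= 0:
        raise ValueError("composition must consume at least one LUT")
    factor = lut_budget // cost_per_unit  # whole units of the mix that fit
    return {kind.name: count * factor for kind, count in ratio.items()}


# Illustrative numbers only: two filter PEs for every transform PE.
mix = {PEKind("fir", 5_000): 2, PEKind("fft", 10_000): 1}
small_device = scale_composition(mix, lut_budget=50_000)
large_device = scale_composition(mix, lut_budget=200_000)
print(small_device, large_device)
```

The same mix description is reused for both budgets, which is the portability argument in miniature: the designer states the ratio once and the tool derives the instance counts per device. A realistic exploration would have to account for more than a single resource type.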

Related Work
Earlier Works
Recent Works
Commercial Tools
TaPaSCo
Hardware Design Abstractions
Software Design Abstractions
Portability
Scalability
Extensibility
TaPaSCo Success Stories
Integration of High-Level Synthesis
Custom HDL-based Accelerators
Softcores
Case Study
RISC-V Processing Elements
PE Local Memory
TaPaSCo FPGA Composition
Application Development with TaPaSCo
Scaling to Larger Devices with TaPaSCo
Future Work
Conclusion