Memory-Aware Functional IR for Higher-Level Synthesis of Accelerators

Christof Schlaak, Christophe Dubach, Tzung-Han Juang

doi:10.1145/3501768

Abstract

Specialized accelerators deliver orders of magnitude higher performance than general-purpose processors. The ever-changing nature of modern workloads is pushing the adoption of Field Programmable Gate Arrays (FPGAs) as the substrate of choice. However, FPGAs are hard to program directly using Hardware Description Languages (HDLs). Even modern high-level HDLs, e.g., Spatial and Chisel, still require hardware expertise. This article adopts functional programming concepts to provide a hardware-agnostic higher-level programming abstraction. During synthesis, these abstractions are mechanically lowered into a functional Intermediate Representation (IR) that defines a specific hardware design point. This novel IR expresses different forms of parallelism and standard memory features such as asynchronous off-chip memories or synchronous on-chip buffers. Exposing such features at the IR level is essential for achieving high performance. The viability of this approach is demonstrated on two stencil computations and by exploring the optimization space of matrix-matrix multiplication. Starting from a high-level representation for these algorithms, our compiler produces low-level VHSIC Hardware Description Language (VHDL) code automatically. Several design points are evaluated on an Intel Arria 10 FPGA, demonstrating the ability of the IR to exploit different hardware features. This article also shows that the designs produced are competitive with highly tuned OpenCL implementations and outperform hardware-agnostic OpenCL code.

Highlights

Designing new accelerators is a manual, time-consuming, and error-prone process
Physical timing issues caused by long signal paths on the FPGA do not occur in simulation
We have shown that a multi-level functional IR is well suited to generate efficient FPGA implementations

Summary

Introduction

Current HDLs are not suitable for a rapid development cycle, and there is a lack of high-level languages and abstractions for efficient hardware design. Languages such as Spatial [18], Chisel [2], or OpenCL reduce the amount of boilerplate code required but remain fairly low-level. Recent years have seen a push towards functional approaches for high-performance computing. Delite [36], Lift [34], and Futhark [12] have demonstrated that high-level abstractions and high performance can go hand in hand. Lift-hls [19] and Aetherling [10] have demonstrated that the functional approach is viable for producing accelerators.
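
To give a flavour of the functional style referred to above, the following minimal sketch expresses matrix-matrix multiplication using only map, zip, and reduce. It is an illustration in plain Python, not the IR or syntax of this article or of any of the cited tools; the names dot, transpose, and matmul are chosen here for the example only. The point is that the algorithm is stated without loops, indices, or memory details, which leaves a compiler free to choose how to parallelise and buffer it.

```python
from functools import reduce
from operator import add, mul


def dot(xs, ys):
    # Dot product as reduce(+) over the element-wise products of two vectors.
    return reduce(add, map(mul, xs, ys), 0)


def transpose(m):
    # Turn rows into columns so that columns of B can be consumed as vectors.
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    # For every row of A, map over every column of B and take the dot product.
    bt = transpose(b)
    return list(map(lambda row: list(map(lambda col: dot(row, col), bt)), a))


a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]
print(matmul(a, b))  # [[19, 22], [43, 50]]
```

Each map in this sketch is a candidate for parallel or pipelined execution, and the reduce is a candidate for an adder tree or an accumulator; deciding between such options is the kind of design-point choice that a lower-level representation has to make explicit.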

Methods

Results

Conclusion